ioos / ioos-code-sprint

Information about IOOS Code Sprint activities.
https://ioos.github.io/ioos-code-sprint/
MIT License

[Project Proposal]: Provider Parameter Mapper Form #33

Open joe-smithe-glos opened 8 months ago

joe-smithe-glos commented 8 months ago

Project Description

Mostly dropping this here before it escapes my brain. As a developer, I want a JavaScript form with POST-able data that lets a buoy/platform data provider indicate which parameters their platform(s) are sending and which CF standard names those parameters map to.

One of the long-term goals of GLOS Seagull is for data providers to be able to onboard their own platforms without much GLOS help. This parameter mapping is probably one of the biggest hurdles to achieving that in a relatively flawless and user-friendly way.
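To make the "POST-able data" idea concrete, here is a minimal sketch of what the form might submit. Everything in it is an assumption for illustration: the field names (`platformId`, `providerName`, `cfStandardName`, `units`), the `validatePayload` helper, and the tiny hard-coded list of CF standard names that stands in for the full CF table. None of it is an existing GLOS Seagull API.

```typescript
// Hypothetical payload shape; field names are illustrative only.
interface ParameterMapping {
  providerName: string;   // name the provider's platform reports, e.g. "wtemp"
  cfStandardName: string; // CF standard name the provider maps it to
  units: string;          // units as reported by the provider
}

interface PlatformMappingPayload {
  platformId: string;
  mappings: ParameterMapping[];
}

// Tiny stand-in for the CF standard name table; a real form would load the full table.
const KNOWN_CF_NAMES: string[] = [
  "sea_water_temperature",
  "air_temperature",
  "wind_speed",
  "sea_surface_wave_significant_height",
];

// Returns human-readable problems; an empty array means the payload is ready to POST.
function validatePayload(payload: PlatformMappingPayload): string[] {
  const problems: string[] = [];
  if (payload.platformId.trim() === "") {
    problems.push("platformId is required");
  }
  const seen: string[] = [];
  for (const m of payload.mappings) {
    if (KNOWN_CF_NAMES.indexOf(m.cfStandardName) === -1) {
      problems.push(`unknown CF standard name: ${m.cfStandardName}`);
    }
    if (seen.indexOf(m.providerName) !== -1) {
      problems.push(`duplicate provider parameter: ${m.providerName}`);
    }
    seen.push(m.providerName);
  }
  return problems;
}

// Example: one buoy reporting water temperature under its own name.
const examplePayload: PlatformMappingPayload = {
  platformId: "example-buoy-1",
  mappings: [
    { providerName: "wtemp", cfStandardName: "sea_water_temperature", units: "degree_C" },
  ],
};

// The JSON string a form would send as the POST body once validation passes.
const requestBody: string = JSON.stringify(examplePayload);
```

Keeping validation as a plain function with no UI dependencies would let the same check run in the form and on whatever service receives the POST.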

Expected Outcomes

Skills required

JavaScript, knows their way around the CF vocabulary: https://cfconventions.org/

Perhaps some Python or R to generate reference files

Expertise

Intermediate

Topic Lead(s)

joe-smithe-glos

Relevant links

No response

joe-smithe-glos commented 6 months ago

Similar:

https://coastwatch.pfeg.noaa.gov/erddap/convert/keywords.html

https://github.com/ioos/ioos-code-sprint/issues/31

^^ Noted on #31, but applicable here too: 'AI-assisted fuzzy-matching of user input to "valid" column names could also be explored here.'

cjolson64 commented 6 months ago

What about having a React component library of vocabulary pickers/recommenders for things like CF and GCMD?

joe-smithe-glos commented 6 months ago

@cjolson64 that's a great idea to generalize the tool. I like it.

7yl4r commented 5 months ago

There is definitely overlap with #31 here. Enough so that we might share a common library between the two projects or consider merging the two, depending on the number of participants we get.

joe-smithe-glos commented 5 months ago

@7yl4r sounds like a plan

joe-smithe-glos commented 5 months ago

@MathewBiddle referenced NERC: https://vocab.nerc.ac.uk/sparql/

CC: https://github.com/ioos/ioos-code-sprint/issues/31
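As a rough idea of how the NERC vocabulary server could feed a picker, the sketch below builds a SPARQL query that searches concept labels for a user's text. It makes several assumptions that should be checked against the NVS documentation: that CF standard names are published as collection `P07`, that concepts are linked to the collection with `skos:member`, and that the query is sent as a URL-encoded `query` parameter. The helper names are made up for this example, and the endpoint is passed in rather than hard-coded because the exact query URL is not confirmed here.

```typescript
// Assumption: CF standard names are published as NVS collection P07.
const CF_COLLECTION = "http://vocab.nerc.ac.uk/collection/P07/current/";

// Escape characters that would break out of a double-quoted SPARQL string literal.
function escapeSparqlLiteral(text: string): string {
  return text
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
}

// Build a case-insensitive "label contains term" query over the assumed collection.
function buildCfSearchQuery(term: string, limit: number = 10): string {
  const safe = escapeSparqlLiteral(term.trim().toLowerCase());
  return [
    "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>",
    "SELECT ?concept ?label ?definition WHERE {",
    `  <${CF_COLLECTION}> skos:member ?concept .`,
    "  ?concept skos:prefLabel ?label .",
    "  OPTIONAL { ?concept skos:definition ?definition . }",
    `  FILTER(CONTAINS(LCASE(STR(?label)), "${safe}"))`,
    "}",
    `LIMIT ${Math.max(1, Math.floor(limit))}`,
  ].join("\n");
}

// Compose a GET URL for a SPARQL endpoint supplied by the caller.
function buildRequestUrl(endpoint: string, term: string): string {
  return `${endpoint}?query=${encodeURIComponent(buildCfSearchQuery(term))}`;
}
```

A shared library for this topic and #31 could wrap the returned URL in a `fetch` call and cache results, so the form does not hit the vocabulary server on every keystroke.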

MathewBiddle commented 5 months ago

Thank you for taking the time to propose this topic! From the Code Sprint topic survey, this has garnered a lot of interest.

Following the contributing guidelines on selecting a code sprint topic I have assigned this topic to @joe-smithe-glos. Unless indicated otherwise, the assignee will be responsible for identifying a plan for the code sprint topic, establishing a team, and taking the lead on executing said plan. The first action for the lead is to:

joe-smithe-glos commented 5 months ago

Thanks, Mat! I'll accept

joe-smithe-glos commented 5 months ago

Coordinate website with https://github.com/ioos/ioos-code-sprint/issues/31

7yl4r commented 5 months ago

I am imagining two separate frontends for this and #31, but one shared package for the JavaScript running the validation. My JavaScript knowledge is a bit aged. Last I knew, there were tons of opinions on ways to set up a package, with no clear frontrunner.

Do you have a JavaScript package template on hand that you want to start with?

joe-smithe-glos commented 5 months ago

Maybe the Next.js framework? Otherwise, there will be a design and story breakdown that will lead up to simple coding of this in whatever flavor of JavaScript we settle on.

cjolson64 commented 5 months ago

I like the idea of using Next.js for building full forms, but I think the end product should be a React component library that can be installed via npm into whatever project it's being used in. A good starting point would be something like this

stevenolthoff commented 5 months ago

I would like to participate in this topic. My JS/TypeScript is pretty good, but I would need to catch up on CF vocab.

Would someone mind posting an example user flow? In particular, what kind of parameters might be difficult for a user to find, and how might the suggestions work?

7yl4r commented 5 months ago

So I am hearing a need for three repositories:

joe-smithe-glos commented 5 months ago

@stevenolthoff Some parameters are difficult to find (like exactly which wavelength band of flux to use for solar radiation), and it can likewise be hard to tell which parameters best fit what your instrument(s) are observing. So a provider could type in a rough idea of what they have, and suggestions would pop up with their CF (etc.) definitions. After some contemplation, they pick one and move on.
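The "type an idea, get suggestions" flow could be sketched roughly as below. This is a deliberately simple prefix-matching scorer, not the AI-assisted fuzzy matching mentioned earlier in the thread. The `VocabEntry` type, `suggest` function, and scoring weights are hypothetical, and the descriptions are short paraphrases written for this example rather than official CF definitions.

```typescript
interface VocabEntry {
  name: string;        // CF standard name
  description: string; // abbreviated, illustrative description (not the CF text)
}

// Small sample; a real picker would load the full vocabulary.
const SAMPLE_ENTRIES: VocabEntry[] = [
  { name: "surface_downwelling_shortwave_flux_in_air", description: "incoming solar shortwave radiation at the surface" },
  { name: "surface_downwelling_longwave_flux_in_air", description: "incoming thermal longwave radiation at the surface" },
  { name: "surface_downwelling_photosynthetic_radiative_flux_in_air", description: "incoming photosynthetically active radiation PAR at the surface" },
  { name: "sea_water_temperature", description: "temperature of sea water" },
  { name: "wind_speed", description: "magnitude of the wind velocity" },
];

// Lowercase and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Score each entry: 2 points per query word that prefixes a word in the name,
// otherwise 1 point if it prefixes a word in the description.
function suggest(query: string, entries: VocabEntry[], maxResults: number = 3): VocabEntry[] {
  const queryTokens = tokenize(query);
  const scored = entries.map((entry) => {
    const nameTokens = tokenize(entry.name);
    const descTokens = tokenize(entry.description);
    let score = 0;
    for (const q of queryTokens) {
      if (nameTokens.some((t) => t.indexOf(q) === 0)) {
        score += 2;
      } else if (descTokens.some((t) => t.indexOf(q) === 0)) {
        score += 1;
      }
    }
    return { entry, score };
  });
  return scored
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score || a.entry.name.localeCompare(b.entry.name))
    .slice(0, maxResults)
    .map((s) => s.entry);
}

// A provider typing "solar radiation" sees the shortwave flux entry ranked first.
const solarSuggestions = suggest("solar radiation", SAMPLE_ENTRIES);
```

Showing the description next to each suggestion is what lets the provider do the "contemplation" step before picking one.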

joe-smithe-glos commented 5 months ago

Rough Pipeline:

joe-smithe-glos commented 5 months ago

Rough Schedule:

  1. Refine problem statement
  2. Generate stories, breakdown, refine
  3. Delegate tasks
  4. Go at it

Anyone want to use a task manager like Pivotal for this or others?

jcermauwedu commented 5 months ago

@joe-smithe-glos It would be nice to have a lightweight metadata editing tool.

Just some random thoughts on specific use cases: (1) define a new platform from the ground up; (2) pre-populate the form from a netCDF file by way of the compliance checker, xpublish, or an ERDDAP endpoint; (3) be able to import/export XML/JSON-LD from the form; (4) it would be good to have a way for this form to interoperate with the compliance checkers. I recall some discussion about enabling access to the compliance checkers via an API. Granted, the form will be doing some of the compliance checker work through its use of standards/vocabulary tables.

The resultant XML/JSON-LD files could be used by glider processing packages to create new configuration files, or update existing ones, for processing glider data into netCDF-formatted files for upload to the IOOS Glider DAC. The glider operators also provide a "delayed" mode version that needs NCEI-approved metadata before it flows into the archive.

Once metadata has been defined and compliance met, then one could think about automatic generation of ERDDAP dataset configuration blocks for those that need them. XML: <dataset>...</dataset>.
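As a hedged sketch of that last step, the function below turns one completed mapping into a `<dataVariable>` fragment of the kind that sits inside an ERDDAP `<dataset>` block. The `VariableMapping` type and the function names are made up for this example, and the fragment is intentionally partial: a working `datasets.xml` entry also needs the enclosing `<dataset>` element with its type and ID, and usually a `<dataType>` per variable, none of which are generated here.

```typescript
// Hypothetical shape for one finished mapping coming out of the form.
interface VariableMapping {
  sourceName: string;      // column/variable name in the provider's data
  destinationName: string; // name to expose in ERDDAP
  standardName: string;    // chosen CF standard name
  units: string;
}

// Escape the characters that are unsafe in XML text content and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a partial <dataVariable> fragment carrying the CF metadata.
function toDataVariableXml(v: VariableMapping): string {
  return [
    "<dataVariable>",
    `    <sourceName>${escapeXml(v.sourceName)}</sourceName>`,
    `    <destinationName>${escapeXml(v.destinationName)}</destinationName>`,
    "    <addAttributes>",
    `        <att name="standard_name">${escapeXml(v.standardName)}</att>`,
    `        <att name="units">${escapeXml(v.units)}</att>`,
    "    </addAttributes>",
    "</dataVariable>",
  ].join("\n");
}

// Example: the provider's "wtemp" column exposed as sea_water_temperature.
const waterTempXml = toDataVariableXml({
  sourceName: "wtemp",
  destinationName: "sea_water_temperature",
  standardName: "sea_water_temperature",
  units: "degree_C",
});
```

Generating the fragment from the same mapping object the form posts would keep the ERDDAP configuration and the provider's answers from drifting apart.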