ices-eg / DIG

ICES Data and Information Group
Creative Commons Zero v1.0 Universal

4 - Decision tree for hosting of datasets outside core systems #5

Closed sjurl closed 1 year ago

sjurl commented 3 years ago

Draft decision tree and ground rules for establishing hosting of datasets outside of existing core systems or existing formats. Includes data visualisation, hosting, and other material that does not fit into a current resolution or request.

neil-ices-dk commented 3 years ago

Draft ready for May 2021 (ahead of DIG); Jens is the lead; Carlos, Laura, Hjalte, Colin

neil-ices-dk commented 3 years ago

Neil is working on a checklist, based on the IODE best practice, that would be operationalised as an online questionnaire. It will be ready for DIG in May, and we will test it with WGEXT beforehand; will post link when ready

neil-ices-dk commented 3 years ago

@jensr and @sjurl this also relates to #12 so not sure if it belongs on that post or this one. The user story: to give an ICES group clear and plain questions that form a checklist that they can fill out to ascertain whether their dataflow/product is sufficiently robust to be used in ICES advice and be given an internal 'stamp' of approval

solution - so far a schematic, but also working on an online survey form: schematic; draft of online form (not complete)

neil-ices-dk commented 3 years ago

updated after initial feedback to encapsulate more specifics on versioning/instancing: schematic v.03MAY image

jensr commented 3 years ago

@neil-ices-dk Thanks for progressing this. I think it does link with #12 and probably #13 as well. The structure of both should really be aligned to make sense to submitters/users. Apologies for not having moved this much forward myself - some great progress both with flow diagram and format. Main thing from the handbooks that needs to be incorporated is probably related to #3 in terms of clarity about licensing.

neil-ices-dk commented 3 years ago

From DIG meeting 18th May 2021

volunteers to review within 2-3 weeks:

neil-ices-dk commented 3 years ago

from breakout groups: group 1 group 2 group 3

neil-ices-dk commented 3 years ago

DIG suggested testers

who have products/datasets ready to trial via such a data profiling tool:

neil-ices-dk commented 3 years ago

an updated version of the renamed 'Data Profiling Tool' (ver June 2021) is now ready; this version incorporates the feedback from the 3 sub-groups to the extent that is feasible.

PDF version (2 pages) JPEG version (1 page)

next steps:

neil-ices-dk commented 3 years ago

accessions@ices.dk will be the default contact email for help to start with

neil-ices-dk commented 3 years ago

we now have 5 entries in the DPT and we need to think about the evaluation process urgently as we will need to advise on how to process these entries. For DIG mid-term meeting

jensr commented 3 years ago

@neil-ices-dk is there a way to access content in the DPT to get a feel for format? This might help with the evaluation steps and condensing down responses in an easier way.

neil-ices-dk commented 3 years ago

@jensr I have downloaded the 9 responses, as of 21 SEP, to an excel file here

neil-ices-dk commented 3 years ago

challenges

next steps?

review core group

Neil, Sjur, Jens, Laura

workshop

neil-ices-dk commented 3 years ago

- @neil-ices-dk to contact Andy Kenny to fill the WGINOSE map tool, to start a dialogue on how this type of visual product can be formalised
- @neil-ices-dk to give access to core group to view and edit the DPT

neil-ices-dk commented 2 years ago

@jensr and @sjurl it looks like i can give you access to edit/view the survey if you have an office365 account and are prepared to give me your account name

neil-ices-dk commented 2 years ago

planning for the 1st meeting of the core group on 5th November; have discussed the next steps in the Ecosystem Overviews Operational Group; they are keen to include this in the review stage at the Advice Drafting Groups, i.e. to check that the supporting dataflows or products have been included in the DPT and reviewed by the DPT Core Group.

neil-ices-dk commented 2 years ago

The list of completed entries is now available as a sharepoint list - which will be the effective archive of entries to the DPT.

DIG DPT Archive

I will (at some point) create a workflow that adds new entries from the form to this list (automation fun...)

neil-ices-dk commented 2 years ago

5th November Review

Decided to create a worksheet view for each transposed entry, flag answers with colours according to the considerations below, and add a column for comments from DIG. The idea is to give an overall evaluation. We managed 5 entries and covered discussions about the process and changes to the survey form questions etc. (below). We will hold a 2nd meeting on 29 Nov to finish the other 5 entries, and then go back to SGCHAIR Debbi, ICES PO Julie and ACOM Henn with feedback on the entries.

updated spreadsheet

After the meeting, Jens developed a template that we can use to formulate feedback to the form fillers in a way that is easy to scan: feedback template

Process considerations

- what are the elements that are 'warnings'

  1. metadata record: if no or don't know
  2. URL: check if works/provides a description of the actual dataflow/product
  3. Open licence: if answer don't know then flag
  4. Available in a well described format: No or Don't know raise flag
  5. Link or description of format: check that it is the relevant resource
  6. Access to web resource: check the URL works
  7. quality: no or don't know; recommendation to fill this gap
  8. sampling methods/collection: if no or don't know recommendation to fill this

- what are the elements that are 'critical blockers'

  1. licence: no or don't know
  2. URL or text: needs to be included
  3. Link or description of format: check that it is filled, exists
  4. documentation/scripts: if no or don't know

- what are the elements that are 'need further information'

  1. update cycle: need to ask if they don't know
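
The flagging rules above could be expressed as a small lookup so that each entry gets the same treatment. This is only a minimal sketch: the field names (`licence`, `url_or_text`, and so on) are hypothetical and do not come from the actual form, the checks that need a person to open a URL are left out, and where licence appears under both 'warnings' and 'critical blockers' the stricter blocker rule is assumed.

```python
from enum import IntEnum


class Flag(IntEnum):
    """Severity levels, ordered so that max() picks the most serious one."""
    OK = 0
    NEED_INFO = 1
    WARNING = 2
    BLOCKER = 3


NO = "no"
DONT_KNOW = "don't know"
EMPTY = ""

# (hypothetical field name, answers that trigger the flag, flag level)
RULES = [
    # critical blockers
    ("licence", {NO, DONT_KNOW}, Flag.BLOCKER),
    ("url_or_text", {EMPTY}, Flag.BLOCKER),
    ("format_link", {EMPTY}, Flag.BLOCKER),
    ("documentation_scripts", {NO, DONT_KNOW}, Flag.BLOCKER),
    # warnings
    ("metadata_record", {NO, DONT_KNOW}, Flag.WARNING),
    ("well_described_format", {NO, DONT_KNOW}, Flag.WARNING),
    ("quality", {NO, DONT_KNOW}, Flag.WARNING),
    ("sampling_methods", {NO, DONT_KNOW}, Flag.WARNING),
    # need further information
    ("update_cycle", {DONT_KNOW}, Flag.NEED_INFO),
]


def triage(entry):
    """Return (overall flag, {field: flag}) for one DPT entry.

    A missing field is treated as an empty answer, so it is only flagged
    where the rule explicitly triggers on an empty value.
    """
    flags = {}
    for field, triggers, level in RULES:
        answer = str(entry.get(field, EMPTY)).strip().lower()
        if answer in triggers:
            flags[field] = level
    overall = max(flags.values(), default=Flag.OK)
    return overall, flags


example = {
    "licence": "Yes",
    "url_or_text": "https://example.org/dataset",
    "format_link": "https://example.org/format",
    "documentation_scripts": "Yes",
    "metadata_record": "Don't know",
    "well_described_format": "Yes",
    "quality": "Yes",
    "sampling_methods": "Yes",
    "update_cycle": "Don't know",
}
overall, flags = triage(example)
print(overall.name, {k: v.name for k, v in flags.items()})
```

The overall flag is simply the most severe individual flag, which matches the idea of giving each entry one overall evaluation while keeping the per-question flags for the comments column.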

Changes to DPT questions/text

  1. Could suggest a different phrasing that asks if the data are ready in an international standard or well documented format, with some examples, to avoid implying that we are offering to 'fix' their data standard

Suggest a 'warning' text at the start of the form asking submitters to double check URLs etc. rather than blindly pasting them in. Note that 'don't know' answers will automatically trigger more questions to understand how this can be resolved. Add a warning before the form is submitted.
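
Part of this double checking could be done automatically before submission. The sketch below is an assumption about how that might look, not something the survey tool provides: the function and field names are hypothetical, and it only checks that a pasted value is a well-formed http(s) URL, not that the page exists or describes the right resource.

```python
from urllib.parse import urlparse


def looks_like_url(text):
    """True if text parses as an http(s) URL with a host part."""
    try:
        parts = urlparse(str(text).strip())
    except ValueError:
        # urlparse raises ValueError on some malformed inputs
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def pre_submit_warnings(answers, url_fields):
    """Collect messages to show the submitter before the form is sent."""
    messages = []
    for field in url_fields:
        if not looks_like_url(answers.get(field, "")):
            messages.append(f"{field}: please check the URL")
    for field, value in answers.items():
        if str(value).strip().lower() == "don't know":
            messages.append(
                f"{field}: 'don't know' will trigger follow-up questions"
            )
    return messages


answers = {
    "metadata_url": "example.org/record",  # missing scheme, so flagged
    "format_link": "https://example.org/format",
    "licence": "Don't know",
}
for message in pre_submit_warnings(answers, ["metadata_url", "format_link"]):
    print(message)
```

A reviewer would still need to open each link, since a well-formed URL can point to the wrong resource.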

neil-ices-dk commented 2 years ago

was having problems with check-out/in, so a slightly modified working template for meeting on Monday https://community.ices.dk/Committees/DIG/Master%20documents/Data-Profiling-Tool/Revised-Data%20Profiling%20Tool_%20Template_Summary%20Sheet.xlsx

neil-ices-dk commented 2 years ago

will run through this in DIG Inter-sessional, there are some follow-up tasks for the DIG sub-group as the roll-out of the DPT is picking up pace

https://community.ices.dk/Committees/DIG/Master%20documents/Data-Profiling-Tool/Data-Profiling-Tool-Progress-EcosystemOverviews.pptx

neil-ices-dk commented 2 years ago

to do's

sjurl commented 2 years ago

Current work is being carried out in https://github.com/ices-eg/DIG/issues/311 and https://github.com/ices-eg/DIG/issues/306

sjurl commented 1 year ago

Work being carried forward in #311 and #306 closing this task