GFDRR / rdls-spreadsheet-template

A template for entering Risk Data Library Standard (RDLS) metadata in spreadsheet format

[Spreadsheet template] Review structure and decide whether to omit fields or enforce 1:1 relationships #1

Closed duncandewhurst closed 10 months ago

duncandewhurst commented 11 months ago

@stufraser1 @matamadio I've generated a basic spreadsheet template using Flatten Tool so that you can get a sense of the scale and structure of a template which contains all the fields in the JSON schema and which permits one-to-many relationships for all arrays.

Please review it and let me know your thoughts on whether you want to omit any fields or enforce 1:1 relationships for any arrays.

This spreadsheet represents the basic Flatten Tool output and doesn't include any of the other features discussed in the scoping document (field metadata; data validation; formatting; worksheet grouping; ordering and colouring etc.)

matamadio commented 11 months ago

Thanks, I'll look through and share comments.

matamadio commented 11 months ago

Looks good. I don't think it's worth stripping out any tabs; if some are not used, could the user remove them?

stufraser1 commented 11 months ago

Otherwise, happy to continue with this, and descriptions will certainly help.

duncandewhurst commented 11 months ago

> Looks good. I don't think it's worth stripping out any tabs; if some are not used, could the user remove them?

Yes, users can remove sheets. We can include instructions to that effect in the documentation.

* On 'datasets', could 'temporal' sit with 'spatial', and could 'publisher' and 'creator' sit together, so that similar fields are grouped?

Yes. However, the fields are ordered according to their order in the RDLS schema so we should fix the ordering there. That way, anything that we generate from the schema (such as the schema browser and reference tables in the RDLS documentation) will also have the desired ordering. I'll open an issue on the main RDLS repo.

* What does the '0' refer to in the field names?

It indicates that each row under that field path should be interpreted as an item in an array, e.g. under attributions/0/id the first row will be interpreted as the id of the first item in the attributions array and the second row will be interpreted as the id of the second item, etc. We can explain this in the documentation.
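To illustrate the row-to-array interpretation described above, here is a minimal Python sketch. It is not Flatten Tool's implementation; the function name `rows_to_arrays` and the `role` column are assumptions made for the example, and it only handles headers of the form `<array>/0/<field>`.

```python
def rows_to_arrays(headers, rows):
    """Turn spreadsheet rows into arrays of objects.

    Each header is assumed to have the form '<array>/0/<field>', where the
    '0' marks that every row is a separate item in the named array.
    """
    result = {}
    for row in rows:
        items = {}  # one partial object per array name for this row
        for header, value in zip(headers, row):
            array_name, placeholder, field = header.split("/")
            if placeholder != "0":
                raise ValueError(f"expected '0' placeholder in {header!r}")
            items.setdefault(array_name, {})[field] = value
        # Append this row's object to the end of each array it touches.
        for array_name, item in items.items():
            result.setdefault(array_name, []).append(item)
    return result


# Hypothetical sheet contents: the 'role' column and values are illustrative.
headers = ["attributions/0/id", "attributions/0/role"]
rows = [["1", "publisher"], ["2", "creator"]]
print(rows_to_arrays(headers, rows))
# {'attributions': [{'id': '1', 'role': 'publisher'}, {'id': '2', 'role': 'creator'}]}
```

The first row becomes the first item in the `attributions` array and the second row becomes the second item, which is the behaviour the `0` in the field path signals.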

* Some reorganisation of the hazards sheets might be helpful, so that they follow the hierarchy in which you would complete them, i.e. hazard_event_sets_hazards before _events before _footprints.

* In fact, reorganisation of all sheets would be helpful. Alphabetical ordering is probably less helpful than ordering by component, in the logical order in which someone would complete the metadata.

Sounds good. It should be possible to order the sheets based on the order of the fields in the schema so we can tackle this at the same time as ordering the fields in the schema.
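As a rough sketch of how sheet order could follow schema order, the Python below walks a JSON schema's `properties` in declaration order and sorts sheet names to match. It assumes sheet names are the underscore-joined paths of arrays of objects; the toy schema, the sheet names, and the function names `array_paths` and `order_sheets` are all hypothetical, and this is not how the template is actually generated.

```python
def array_paths(schema, prefix=""):
    """Yield underscore-joined paths of arrays of objects, in schema order."""
    for name, prop in schema.get("properties", {}).items():
        path = f"{prefix}_{name}" if prefix else name
        items = prop.get("items", {})
        if prop.get("type") == "array" and items.get("type") == "object":
            yield path
            yield from array_paths(items, path)  # nested arrays follow their parent
        elif prop.get("type") == "object":
            yield from array_paths(prop, path)


def order_sheets(sheet_names, schema):
    """Sort sheet names by where their array appears in the schema."""
    order = {path: i for i, path in enumerate(array_paths(schema))}
    # Sheets not found in the schema go last, alphabetically.
    return sorted(sheet_names, key=lambda n: (order.get(n, len(order)), n))


# Hypothetical schema: 'b_first' is declared before 'a_second'.
toy_schema = {
    "properties": {
        "b_first": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "nested": {"type": "array", "items": {"type": "object"}},
                },
            },
        },
        "a_second": {"type": "array", "items": {"type": "object"}},
    }
}

print(order_sheets(["a_second", "b_first_nested", "b_first"], toy_schema))
# ['b_first', 'b_first_nested', 'a_second']
```

Because the ordering comes from the schema, reordering fields there would change the sheet order without any separate configuration.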

matamadio commented 11 months ago

Please check the enums in the template; they seem to pull in the wrong columns. Many tabs are also missing the enum lists. I commented on a couple in the online version, but there are many more.

duncandewhurst commented 11 months ago

Ah yes, the codelist validation in every worksheet except datasets was out of sync. I've fixed that in the updated version shared in #3.

duncandewhurst commented 10 months ago

Closing since the sheets were reordered, field order will be addressed as part of https://github.com/GFDRR/rdl-standard/issues/177, and the readme contains documentation on hiding unneeded sheets.