NIEHS / beethoven

BEETHOVEN is: Building an Extensible, rEproducible, Test-driven, Harmonized, Open-source, Versioned, ENsemble model for air quality
https://niehs.github.io/beethoven/

Naming conventions and lookup tables facilitating collaboration #318

Open kyle-messier opened 6 months ago

kyle-messier commented 6 months ago

Hi @sigmafelix @eva0marques @mitchellmanware @dzilber @Sanisha003 @larapclark @dawranadeep

Overview

Following some discussion with the group and a detailed meeting with @sigmafelix and @eva0marques, we have updated the project README to help us stay consistent and reproducible as we continue with this large project and its model development. I'll highlight a few key points here and then list action items with proposed owners.

Targets Migration

We are moving towards a reproducible targets pipeline. All objects created during the pipeline will be stored in the _targets folder, which will be listed in .gitignore and not stored on GitHub. The targets pipeline will automatically handle where particular types of data or objects are stored, so we no longer need to manage a folder hierarchy by hand. Currently, we have a hierarchy such as input/, input/raw/, output/nlcd/, etc. We also have to maintain different file types and potentially keep track of old versus updated versions. That protocol is currently in use, but it will be deprecated once we move to the targets pipeline.
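As a rough illustration of how targets replaces a hand-managed folder hierarchy, here is a minimal sketch of a `_targets.R` file. The target names `raw_values` and `mean_value` are hypothetical placeholders, not names from the beethoven pipeline or from the README naming conventions.

```r
# _targets.R: minimal sketch; target names are hypothetical placeholders
library(targets)

pipeline <- list(
  # An upstream target holding some input values
  tar_target(raw_values, c(1, 2, 3)),
  # A downstream target; targets infers the dependency on raw_values
  tar_target(mean_value, mean(raw_values))
)

# _targets.R must end with the list of target definitions
pipeline
```

Running `targets::tar_make()` from the project root builds both targets and, with the default settings, writes their values under `_targets/objects/`. `targets::tar_read(mean_value)` then loads a stored value back by name, so no output path has to be chosen by hand.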

Naming conventions for the target objects. See the section in the README. Please suggest anything we may have missed.

Naming conventions for functions.

Folder Structure

Brief descriptions of the folder sections are needed:

Punchcard

@sigmafelix should we rename this to lookup_table for consistency with other similar things in the literature?

@sigmafelix

sigmafelix commented 6 months ago

@kyle-messier I think lookup_table looks okay, but I am a little concerned about confusion between this file and other tables that serve as lookups. We used to have nlcd_classes.csv as a lookup in the NLCD feature calculation function, which has now been moved to amadeus. If we are sure that there will be no similar lookup table(s) in beethoven, I will change the file name to lookup_table. I would suggest pipeline_configuration as another candidate.