GlobalHydrologyLab / AquaSat

Monitoring water quality from space!
MIT License

project structure proposal #1

Closed by aappling-usgs 7 years ago

aappling-usgs commented 7 years ago

@matthewross07 here's a proposed project skeleton with some starter files.

One core idea is that there's some general direction of data flow from raw data (folders starting with 1_) to a paper/proposal/presentation/etc (folder[s] starting with 9_), and that the scripts, configurations, and outputs of those steps can be stored in numbered folders to broadly describe that flow. Soon I expect we'll add numbered folders like 3_combine or 5_model, but I'm leaving those out for now because I don't know what the right grouping or naming is.
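To make the numbering convention concrete, here is a minimal Python sketch (not part of the PR) that orders folders by their numeric prefix. Only `1_wqdata`, `3_combine`, and `5_model` come from this thread; `9_paper` and `lib` are hypothetical names used for illustration.

```python
import re

def stage_number(folder_name):
    """Return the leading integer of a name like '1_wqdata', or None if absent."""
    match = re.match(r"(\d+)_", folder_name)
    return int(match.group(1)) if match else None

def order_stages(folder_names):
    """Keep only numbered folders and sort them by their numeric prefix."""
    numbered = [name for name in folder_names if stage_number(name) is not None]
    return sorted(numbered, key=stage_number)

# '9_paper' and 'lib' are hypothetical; the numbered prefixes define the flow.
folders = ["9_paper", "lib", "1_wqdata", "5_model", "3_combine"]
ordered = order_stages(folders)
```

Sorting on the parsed integer rather than the raw string keeps the order correct if a two-digit prefix is ever introduced.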

The other big idea that I'd love to try here, if you're game, is to use remake (https://github.com/richfitz/remake) and scipiper (https://github.com/USGS-R/scipiper) to formally describe and direct the flow of data. I wrote more about scipiper in the README, and there's a simple working example in this pull request (PR).
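For readers new to dependency-driven builds, the general idea can be sketched in a few lines of Python. This is a make-style illustration that compares timestamps; it is not how remake or scipiper are implemented, and every path except `1_wqdata/cfg/wqp_codes.yml` is hypothetical.

```python
def needs_rebuild(target, depends_on, mtimes, _cache=None):
    """Return True if `target` is missing, older than a dependency,
    or downstream of a dependency that itself needs rebuilding."""
    cache = {} if _cache is None else _cache
    if target in cache:
        return cache[target]
    if target not in mtimes:
        result = True  # the target has never been built
    else:
        result = any(
            needs_rebuild(dep, depends_on, mtimes, cache)
            or mtimes.get(dep, float("inf")) > mtimes[target]
            for dep in depends_on.get(target, [])
        )
    cache[target] = result
    return result

# Hypothetical dependency graph: config -> pulled data -> figure.
depends_on = {
    "1_wqdata/out/data.csv": ["1_wqdata/cfg/wqp_codes.yml"],
    "9_paper/fig.png": ["1_wqdata/out/data.csv"],
}
# Hypothetical modification times; the config was edited after the data pull.
mtimes = {
    "1_wqdata/cfg/wqp_codes.yml": 200,
    "1_wqdata/out/data.csv": 100,
    "9_paper/fig.png": 150,
}
stale = [t for t in depends_on if needs_rebuild(t, depends_on, mtimes)]
```

Here both the data file and the figure are flagged: the data file is older than its config, and the figure depends on a stale input even though it is newer than the existing data file.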

I got the list of possible chlorophyll codes in 1_wqdata/cfg/wqp_codes.yml from somebody else's old WQP code. Are any of those of interest, or do you just want to stick with "Chlorophyll a"? I don't know yet about the frequency of observations from each code, but I will find that out soon.
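One way to check observation frequency per code, sketched in Python under stated assumptions: the pulled results are available as CSV text with a `CharacteristicName` column (assumed here), and the rows below are made-up placeholders, not real WQP data.

```python
import csv
import io
from collections import Counter

def count_by_characteristic(csv_text, column="CharacteristicName"):
    """Count rows per characteristic name in CSV text (column name is an assumption)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)

# Placeholder rows for illustration only; not actual observations.
sample = (
    "CharacteristicName,ResultMeasureValue\n"
    "Chlorophyll a,3.2\n"
    "Chlorophyll a,5.1\n"
    "Some other chlorophyll code,1.7\n"
)
counts = count_by_characteristic(sample)
```

Calling `counts.most_common()` would then list the codes from most to least frequently observed, which is enough to decide whether the less common names are worth pulling.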

matthewross07 commented 7 years ago

I like the proposed directory structure, using remake, and following the scipiper rules. For the broader pull of chlorophyll names, we might as well pull all of it down, at least for LA and WI, and see whether that adds any information, though I'm skeptical we have enough resolution. If any of these data overlap with Hyperion flyovers, it would be great to have that extra info.