nc-minibbs / mbbs

A repository for the Mini-Bird Breeding Survey data
https://minibbs.us

Clarify the data pipeline #70

Closed bsaul closed 1 month ago

bsaul commented 5 months ago

@IJBG - I started putting something down in docs/data.md

bsaul commented 1 month ago

@IJBG : regarding the scope of this PR: I'd like to get docs/data-pipeline.md at least roughed in.

I realize the PR as is currently has quite a bit more than that, so I may need to revert some stuff.

bsaul commented 1 month ago

> at least roughed in.

By "roughed in" I mean at least describing all the source and derived datasets, though we don't need to have every detail pinned down.

bsaul commented 1 month ago

Minor, but mentioning it here: the directories in stop_level_data don't share a common naming scheme, e.g. Hurlbert_stops, Patsy_Bailey_stops, Pippen_Durham_stops.
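
A rough way to flag this kind of inconsistency, as a sketch only: the three directory names are the ones quoted above, while the helper names and the "count the pieces before `_stops`" rule are assumptions made for illustration, not anything in the repo. Matching counts would not prove the names follow one scheme; differing counts just show that they don't.

```python
# Directory names are the three quoted above; everything else is illustrative.
STOP_DIRS = ["Hurlbert_stops", "Patsy_Bailey_stops", "Pippen_Durham_stops"]
SUFFIX = "_stops"


def prefix_parts(dirname):
    """Split the part of the name before '_stops' into underscore-separated pieces."""
    if not dirname.endswith(SUFFIX):
        raise ValueError(f"{dirname!r} does not end with {SUFFIX!r}")
    return dirname[: -len(SUFFIX)].split("_")


def parts_per_dir(dirnames):
    """Map each directory name to the number of pieces in its prefix."""
    return {d: len(prefix_parts(d)) for d in dirnames}


def has_common_scheme(dirnames):
    """Hypothetical check: every prefix has the same number of pieces."""
    return len(set(parts_per_dir(dirnames).values())) <= 1


if __name__ == "__main__":
    for name, n in parts_per_dir(STOP_DIRS).items():
        print(f"{name}: {n} part(s) before {SUFFIX}")
    print("common scheme:", has_common_scheme(STOP_DIRS))
```

Something like this could run over the real directory listing once a target scheme is agreed on.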

bsaul commented 1 month ago

@IJBG - I think this is a fine sketch to start work towards #93. What do you think?

I think we should not merge this branch into master. Instead, I think we might create a draft repo-reorg branch where the work in this branch would be the start. Then any work towards reorganization would target that branch until we're satisfied with the reorganization and then merge into master. I just think that updating master as we go might cause headaches for projects like the mbbs website, as we're likely to be breaking things.

IJBG commented 1 month ago

@bsaul Yeah, I think this is a good base!

I agree about not merging into master. At the very end there may be some merge conflicts with files that are modified in master and also modified + moved in this branch (e.g. fixing the filepaths for read_csv), but it's fine to handle those in a batch at the end. And with step (c) in the build process, yeah, better to keep it on its own branch and not break the website in the meantime.

bsaul commented 1 month ago

Cool. I'll create a new branch from master, draft PR, and merge this branch into that one.