Closed ethanwhite closed 2 years ago
I kinda like the giant repo idea, as long as it's well documented? These are all steps to taking imagery and turning it into ecological data.
As in one big repo that includes everything in the Zooniverse folder, or do you think we should leave it all here along with all of the field data?
If the counts from the imagery will go in this repo as 'data,' seems fine to keep it in here. However, that means you will have to deal with the data structure of this repo. All PRs get tested for new data additions. While you don't have to fix data-related errors, you should at least let us know by opening an issue and/or via Slack. And every commit needs to have versioning instructions and should pass the versioning step in the CI, e.g., #48 should have had a [no version bump] tag. So you might want to separate it if you don't want to deal with all that.
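To make the versioning point concrete, here is a minimal sketch of the kind of check a CI versioning step might run on a commit message. Only the `[no version bump]` tag comes from this thread; the other tags and the function name `find_version_tag` are assumptions for illustration, not this repo's actual CI code.

```python
# "[no version bump]" is the tag mentioned in this thread. The remaining
# tags are hypothetical placeholders, not confirmed for this repo.
VERSION_TAGS = ("[no version bump]", "[patch]", "[minor]", "[major]")


def find_version_tag(commit_message):
    """Return a recognized version tag found in the message, or None."""
    lowered = commit_message.lower()
    for tag in VERSION_TAGS:
        if tag in lowered:
            return tag
    return None


def passes_versioning_step(commit_message):
    """A commit passes only if it carries some versioning instruction."""
    return find_version_tag(commit_message) is not None
```

Under these assumptions, a commit that only touches Zooniverse code would still need something like `[no version bump]` in its message to pass.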
Also, it seems like it might be confusing for folks who come here looking for data to have Zooniverse/bird-bird-bird details in the top-level readme. Seems like those should go in a Zooniverse/README or something.
I don't think this should all be in the same repo. The smell test for me is that we would have completely different people working on two distinct pieces of the repo, with no meaningful overlap in issues, testing, code, etc. Keeping them together therefore causes challenges with testing (reflected in @gmyenni's point about testing, @bw4sz's confusion a few days ago about an R test error on a change to a Python file, and the fact that we aren't currently running the Python tests), reading the history (which makes looking up changes to the living data harder), supporting different user bases (@gmyenni's point about the README), etc. We can definitely set up a system to push data derived from remote sensing into this repo or have this repo pull it from somewhere else like we do with weather in PortalData.
So, I think that the Zooniverse directory does need to be split out and I agree that minimizing further splitting makes sense. So, barring objection, I'll plan on splitting out the Zooniverse directory into its own "everglades tools" repo for now. I'll keep the history for the folder in that repo. Going this route, the question is whether we want to rewrite the history for this repo to remove the commits related to the Zooniverse directory.
Does that all sound good to folks?
sure.
-- Ben Weinstein, Ph.D. Postdoctoral Fellow University of Florida http://benweinstein.weebly.com/
@gmyenni - if you're on board too, can you let me know whether you want to rewrite the history for this repo to drop all of the commits that occurred only in the Zooniverse directory?
Yep, that's fine with me. You can rewrite the history. So does the App/ directory go too?
Yeah, probably.
@bw4sz - does anything in the Shiny app rely on what's in Zooniverse?
I have split out the App and Zooniverse directories into their own (combined) repo with all of the structure identical to the current setup: https://github.com/weecology/EvergladesTools
I've also transferred all of the relevant issues to that repo. Take a look and see what you think. If folks decide they want to separate the Shiny app out into its own repo I can do that as well.
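For anyone wanting to do a similar split elsewhere, here is a rough sketch of one possible approach using the third-party `git filter-repo` tool. This is not necessarily how the split above was done. The helper name `filter_repo_commands` is made up for illustration, and the script only builds and prints the commands rather than running them.

```python
import shlex


def filter_repo_commands(paths):
    """Build the two git-filter-repo invocations for a two-way split.

    The first command keeps only the given paths (for the new tools repo).
    The second drops those paths (for rewriting the data repo's history).
    Each is meant to be run in its own fresh clone, since git-filter-repo
    rewrites history in place.
    """
    path_args = []
    for p in paths:
        path_args += ["--path", p]
    keep_only = ["git", "filter-repo"] + path_args
    drop = ["git", "filter-repo"] + path_args + ["--invert-paths"]
    return keep_only, drop


keep_only, drop = filter_repo_commands(["Zooniverse/", "App/"])
print(shlex.join(keep_only))  # run in a fresh clone that becomes the new repo
print(shlex.join(drop))       # run in a separate fresh clone of the data repo
```

Since the second command rewrites history, everyone with an existing clone of the data repo would need to re-clone afterwards, which is why it is worth agreeing on it first.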
Super population model work is now here: https://github.com/weecology/SuperPopulationModel
The Zooniverse directory currently contains code for:
I think that some of this probably belongs in separate repos since this is the equivalent of https://github.com/weecology/PortalData. So, I wanted to start a discussion with at least @gmyenni & @bw4sz about what (in the long run) we think should be moved, what chunks we should split it into, and what (if anything) makes sense to keep in this repo. Thoughts?