microsoft / CameraTraps

PyTorch Wildlife: a Collaborative Deep Learning Framework for Conservation.
https://cameratraps.readthedocs.io/en/latest/
MIT License
784 stars, 246 forks

Meta-issue: list of open issues, random todo's, and half-baked ideas #331

Open agentmorris opened 1 year ago

agentmorris commented 1 year ago

Sometimes folks ping us to ask how they can contribute code to the MegaDetector project, and we don't really have a place to point them right now. Combined with the fact that a couple of important open issues have been languishing for a few weeks (months?), I got motivated to create this issue as a snapshot of our internal todo list, so I have somewhere to point folks who want to get involved. I'm making only a weak attempt at prioritization here; instead, I'm just trying to sort the items into logical buckets.

If you're interested in trying your hand at any of these, email us!

Feature additions for existing scripts/tools

Refactoring or re-writing stuff

Infrastructure things

Miscellaneous things that are more exploratory

Other projects that could use your help

If you found this text because you want to work on open-source code related to conservation, and everything I just listed is either too boring or too daunting, please don't give up! Depending on your specific skill set, maybe our close collaborators who maintain EcoAssist, Timelapse, or any of the platforms listed here could use contributions. Or head over to the "Open Source Solutions" forum at WILDLABS, and offer your skills there!

Random models someone should train

Now I'm letting this thread really veer off into a tangent, but FWIW, people frequently ask us "can MegaDetector do [x]?", where [x] is something MegaDetector definitely can't do. But there are some values of [x] that have come up a bunch of times and feel like the right balance of "tractable" and "useful", where there's sort of the right training data in the universe, and a focused student project could really get something going. So, to finish up this long post with lots of random ideas:

patelvyom commented 1 year ago

Another to-do might be to rewrite the batch detection scripts to use PyTorch's DataLoader instead of managing image I/O manually. This would also allow batch inference instead of looping over the images one by one, which should significantly improve inference performance.

agentmorris commented 1 year ago

Updating my response to this suggestion: rather than investing time in using the PyTorch data loader, I'd like to see someone experiment with YOLOv5's native inference tools (val.py and detect.py) as a total replacement for our inference scripts. These have all the benefits of "proper" PyTorch data loading, but also have a zillion bells and whistles, especially test-time augmentation that could improve accuracy.

--

That's a great suggestion, I'll add an item to the list. More specifically, though, the item is to do a performance test (which can be arbitrarily inelegant) to see what the benefit would be, with and without a GPU, and to make sure results are identical. If the benefit is more than around a 25% speedup, it's probably worth it. If it's less than that, it may be preferable to keep the current approach, which is easier to debug and maintain, and keeps a much longer shared code path across PyTorch and TF. Also, I vaguely remember that images in a batch need to be the same size, which isn't guaranteed, so either the test would need to verify that this constraint doesn't actually apply, or the implementation would need to break batches when the image size changes.
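For anyone picking this up, here is a minimal sketch of the two pieces described above: breaking batches whenever the image size changes, and deciding whether a measured speedup clears the rough 25% bar. Everything named here (`size_consistent_batches`, `worth_switching`, the `(filename, (width, height))` records) is hypothetical and not part of the MegaDetector codebase, and "25% speedup" is assumed to mean that the old run time divided by the new run time is at least 1.25.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record for illustration: (filename, (width, height)).
ImageInfo = Tuple[str, Tuple[int, int]]


def size_consistent_batches(
    images: Iterable[ImageInfo], max_batch_size: int
) -> Iterator[List[str]]:
    """Yield batches of consecutive filenames that all share one image size.

    A batch is closed when it reaches max_batch_size or when the next
    image has a different size, so no batch ever mixes sizes.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    # groupby only groups *consecutive* items with the same key, which is
    # exactly the "break the batch when the size changes" behavior.
    for _size, run in groupby(images, key=lambda item: item[1]):
        batch: List[str] = []
        for filename, _ in run:
            batch.append(filename)
            if len(batch) == max_batch_size:
                yield batch
                batch = []
        if batch:
            yield batch


def worth_switching(
    baseline_seconds: float,
    candidate_seconds: float,
    results_match: bool,
    min_speedup: float = 1.25,  # assumed reading of "around a 25% speedup"
) -> bool:
    """Return True only if results are identical and the speedup is large enough."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("timings must be positive")
    return results_match and (baseline_seconds / candidate_seconds) >= min_speedup


if __name__ == "__main__":
    demo = [
        ("a.jpg", (1920, 1080)),
        ("b.jpg", (1920, 1080)),
        ("c.jpg", (1920, 1080)),
        ("d.jpg", (1280, 720)),
        ("e.jpg", (1920, 1080)),
    ]
    for batch in size_consistent_batches(demo, max_batch_size=2):
        print(batch)
```

Sorting the file list by image size before batching would produce fewer, fuller batches, at the cost of changing the order in which results come back; whether that matters depends on how the results get written out.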

Nidhi703 commented 10 months ago

Hey, I found this topic very interesting and useful, and I would like to contribute if allowed, or at least give it a proper try. This would be my first open source contribution, so I would really need some guidance. Is there someone I could talk to about it, and maybe some easy tasks I could work on to get better at things?

zhmiao commented 8 months ago

Hello @Nidhi703, we are very sorry for the late reply. We totally missed your reply. Would you like to let us know which part of the list you want to contribute to? Thank you very much!

arky commented 8 months ago

@agentmorris @zhmiao Perhaps there is value in taking some of these ideas and filing them as individual issues. I believe that would provide good contributor pathways for new community members to join the project.

@Nidhi703 Please consider joining the Discord channel if you haven't already!