Open JohnTigue opened 4 years ago
The cells in the challenge dataset are simply pre-released brain-map.org cells. The Allen Institute is constantly adding more cells.
This project has taken the raw challenge dataset and built specimens_manifest.json as a stand-in for the AllenSDK cell cache manifest JSON. But the data can just as easily be pulled down from brain-map.org or from a Wasabi-hosted prerelease. Once the cells have been released, it is only natural to access them through brain-map.org.
So, tons of training data for researchers. And a challenge can be issued at any time by simply selecting ten cells from the brain-map.org data that are without public SWCs.
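A minimal sketch of that selection step, assuming the cell records are already loaded as a list of dicts. The `has_swc` flag and the toy records are hypothetical stand-ins, not the actual brain-map.org or specimens_manifest.json field names:

```python
import random


def pick_challenge_cells(cells, n=10, seed=None):
    """Randomly choose n cells that have no public SWC reconstruction."""
    # 'has_swc' is a hypothetical flag; substitute whatever field the real
    # manifest uses to mark a cell as having a public reconstruction.
    candidates = [c for c in cells if not c.get("has_swc", False)]
    if len(candidates) < n:
        raise ValueError(f"only {len(candidates)} cells lack an SWC, need {n}")
    # A seeded generator makes the challenge set reproducible.
    return random.Random(seed).sample(candidates, n)


# Toy records standing in for a real manifest: every third cell has an SWC.
cells = [{"id": i, "has_swc": i % 3 == 0} for i in range(1, 61)]
challenge = pick_challenge_cells(cells, n=10, seed=0)
```

Fixing the seed means the same ten cells can be re-derived later, which is handy if a challenge set needs to be audited.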
See also:
#60 for a related, earlier issue
#42 (dup?)
Currently, the Allen Institute images more cells than it reconstructs. Yet even as things stand, brain-map.org has hundreds of cells with semi-automatically generated skeletons that have passed human QC review. That's tons more labeled training data.
Also, this and #60 are all the more evidence that adding a mechanism to allensdk that pulls down image stacks (via their RMA interface, but wrapped in friendly Python) is a valuable PR to send back upstream.
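As a rough idea of what the "friendly Python" wrapper might start from, here is a sketch that only builds an RMA query URL and does not fetch anything. The endpoint and the `model::`, `rma::criteria`, and `rma::options` syntax follow my understanding of the public RMA documentation. The function name is made up for illustration, and the `Specimen` model and `id` filter are placeholders, since which model actually exposes the image stacks is not settled here:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public RMA query endpoint (JSON flavor).
RMA_ENDPOINT = "https://api.brain-map.org/api/v2/data/query.json"


def rma_query_url(model, filters=None, num_rows=50, start_row=0):
    """Build an RMA query URL (hypothetical helper, not part of allensdk)."""
    criteria = f"model::{model}"
    if filters:
        # Equality filters on the model, written as [field$eqvalue].
        criteria += ",rma::criteria," + "".join(
            f"[{field}$eq{value}]" for field, value in filters.items()
        )
    # Paging options so large result sets can be walked in chunks.
    criteria += f",rma::options[num_rows$eq{num_rows}][start_row$eq{start_row}]"
    return f"{RMA_ENDPOINT}?{urlencode({'criteria': criteria})}"


# Placeholder model and specimen id, for illustration only.
url = rma_query_url("Specimen", {"id": 123}, num_rows=1)
# Decode the query string again to inspect the criteria that would be sent.
decoded = parse_qs(urlsplit(url).query)["criteria"][0]
```

An upstream PR would presumably hang something like this off allensdk's existing API classes rather than a free function, but the URL construction is the part that needs to be right first.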
Bonus: figure out how to get brightfield stacks into NWB:N 2.0, or at least CZI's JSON.