flatironinstitute / spikeforest

Spike sorting benchmarking system
Apache License 2.0

Requesting URI for synth mearec neuronexus recordings, and confused on how to obtain true firings #33

Open jacobpennington opened 1 year ago

jacobpennington commented 1 year ago

Hello, I have a couple of questions:

1) What URI can I use, if available, to download the synth mearec neuronexus recordings through the spikeforest API?

2) How do I download the true firings for a given recording? I can see the sha1 URI associated with each one on the SpikeForest website, but it's not clear how to actually download that file. I tried using the kachery_cloud API based on some other code in the repository as follows:

import kachery_cloud as kcl
# Used URI for true firings of first recording in paired_english
x = kcl.load_json('sha1://924a22045736c5a8b45d85c5ced42d42d308e699/firings_true.json')

But this assigns x a value of None.

Thanks, Jacob.

magland commented 1 year ago

Hi @jacobpennington, thanks for the questions.

To load the true firings for any spikeforest recording, follow this example: https://github.com/flatironinstitute/spikeforest

Regarding the synth mearec neuronexus recordings, those need to be prepared for this new system. I am trying to track down the scripts to do that.

jacobpennington commented 1 year ago

Hello. If possible, I'd like to request that "synth_bionet_static" be added as well. Alternatively (or in addition), can you point me to an example of how to download the raw .mda files? I tried to follow this script in the spikeforest_recordings repository, but it looks like it has diverged from the kachery package: https://github.com/flatironinstitute/spikeforest_recordings/blob/master/scripts/prepare_recordings.py

magland commented 1 year ago

To your question about how to load the ground truth, see this example: https://github.com/flatironinstitute/spikeforest/blob/main/examples/load_extractors_for_recording.py

Unfortunately the simulated datasets are not available with the new system, and I am having a hard time tracking down the scripts that could be used to make them available...

jacobpennington commented 1 year ago

Thanks - I was able to load the ground truth, but I'm still not sure how to get the data for the simulated datasets (not necessarily through the new system; any way of obtaining them would work).

magland commented 1 year ago

I do have the files, but at this point it's sort of a big project to make them available. I guess it depends on how much you want the data... would you be willing to embark with me on a project to package up the data into a DANDI dataset?

jacobpennington commented 1 year ago

I don't really need the entire dataset; this is just to do a test run with the new version of Kilosort to compare to some other simulated data. Would just one or two recordings (the raw traces and the probe information) be doable without needing to repackage everything?

magland commented 1 year ago

> I don't really need the entire dataset; this is just to do a test run with the new version of Kilosort to compare to some other simulated data. Would just one or two recordings (the raw traces and the probe information) be doable without needing to repackage everything?

I can do this manually for individual recordings, but you'll need to do some work on your end.

So in this study

http://spikeforest.flatironinstitute.org/study/synth_mearec_neuronexus_noise10_K10_C32

I looked at the first recording. It has a directory of:

sha1dir://30aa1acfedd28e527555908a739d7e61e4f741f0.patched

The latest kachery does not support the sha1dir system, but we can retrieve the index file (which I made available):

kachery-cloud-cat sha1://30aa1acfedd28e527555908a739d7e61e4f741f0
{"files": {"raw.mda": {"size": 2304000020, "sha1": "2cd6bdd2e11c33ccfe2414baaca98474c466a5ee"}, "geom.csv": {"size": 1626, "sha1": "d8abb67f29853b8b45035f3456cc1642ae7df306"}, "firings_true.mda": {"size": 731996, "sha1": "6edd58610db6b88a3104fc313b6626c3f22ab944"}, "params.json": {"size": 41, "sha1": "089943a2a3bbde3a809a6a41f8ca6e06d7828e86"}, "firings_true_gt1.mat": {"size": 938150, "sha1": "525578f0d3302d336e709e6f43f8da2b13d58e9d"}, "geom.prb": {"size": 359, "sha1": "32dbdb7480cd494a852e0036d5d172ae0db2b3d1"}}, "dirs": {}}

Then I made raw.mda available; it can be retrieved from sha1://2cd6bdd2e11c33ccfe2414baaca98474c466a5ee using kcl.load_file(...).

You can see the other files in the above... I made raw.mda, geom.csv, firings_true.mda, and params.json available in a similar way.
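For reference, the index printed by kachery-cloud-cat can be turned into per-file sha1:// URIs with a few lines of stdlib Python; each URI can then be passed to kcl.load_file(...). This is a small sketch using two of the entries from the index above (substitute the full JSON output):

```python
import json

# Index as printed by kachery-cloud-cat (two entries shown for brevity)
index_json = '''
{"files": {"geom.csv": {"size": 1626, "sha1": "d8abb67f29853b8b45035f3456cc1642ae7df306"},
           "params.json": {"size": 41, "sha1": "089943a2a3bbde3a809a6a41f8ca6e06d7828e86"}},
 "dirs": {}}
'''

index = json.loads(index_json)

# Build a filename -> sha1:// URI map from the "files" section of the index
uris = {name: "sha1://" + entry["sha1"] for name, entry in index["files"].items()}

for name, uri in sorted(uris.items()):
    print(name, uri)
```

Note that the individual files are only retrievable if they have been made available in the cloud (as done above for raw.mda, geom.csv, firings_true.mda, and params.json); otherwise kcl.load_file(...) returns None.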

Let me know if you are able to load that, or if you need additional help. If it works, I can prepare others in a similar way.

jacobpennington commented 1 year ago

Thanks for doing this. This seems to be mostly working, except I can't load 'geom.csv' (raw and params worked fine). kcl.load_file(<sha for geom>) returns None for that one.

magland commented 1 year ago

> Thanks for doing this. This seems to be mostly working, except I can't load 'geom.csv' (raw and params worked fine). kcl.load_file(<sha for geom>) returns None for that one.

Okay try again now.

jacobpennington commented 1 year ago

Yep that works, thanks!