cc @sgratiy. @t-b is going to take a stab at this, questions will probably be heading our way.
Where can I get some NWB v1 sample files for testing with `run_pipeline.py`?
- `NwbReader`: add a new reader class to read NWB v2 files using pynwb
- `create_nwb_reader`: adapt it to use this new class for v2 files; use the same logic as depicted in https://github.com/AllenInstitute/IPNWB/commit/6d8d504cee29b83c8307f7f2ceb1262e49e4a559 to check for NWB v2 files (a sketch of such a check follows below)
- `AibsForeignDataset`: new data set class
Progress can be tracked at https://github.com/t-b/ipfx/tree/add-AibsForeignDataset.
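A rough sketch of the kind of version check the `create_nwb_reader` item refers to (my own illustration, not necessarily what the linked IPNWB commit does; it relies on NWB v1 files storing the version in a top-level `nwb_version` dataset while v2 files expose a root `nwb_version` attribute):

```python
import h5py


def get_nwb_version(nwb_path):
    """Best-effort sniffing of the NWB version string, e.g. 'NWB-1.0.5' or '2.0b'."""
    with h5py.File(nwb_path, "r") as f:
        if "nwb_version" in f:            # NWB v1 layout: top-level dataset
            version = f["nwb_version"][()]
        elif "nwb_version" in f.attrs:    # NWB v2 layout: root attribute
            version = f.attrs["nwb_version"]
        else:
            return None
    if isinstance(version, bytes):
        version = version.decode("utf-8")
    return version


if __name__ == "__main__":
    import sys
    print(get_nwb_version(sys.argv[1]))
```

`create_nwb_reader` could then dispatch to the new pynwb-based reader whenever the returned version (after stripping any "NWB-" prefix) starts with "2", and fall back to the existing readers otherwise.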
I have sample files. What would be the preferred way for sharing them with you?
Either Dropbox (brain grant account), Slack, or I can give you FTP credentials.
Could you please provide details on how to use the "brain grant account" Dropbox?
Done. Pinged you on slack.
Perhaps a better idea is to use public files, e.g. use the AllenSDK to download a cell: https://ipfx.readthedocs.io/en/latest/auto_examples/qc_examples/sweep_qc.html#sphx-glr-auto-examples-qc-examples-sweep-qc-py
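For example (a sketch using the AllenSDK cell types cache; the specimen ID is only illustrative, taken from the file name used later in this thread):

```python
from allensdk.core.cell_types_cache import CellTypesCache

# Downloads the NWB file for one cell into the local cache directory.
ctc = CellTypesCache(manifest_file="cell_types/manifest.json")
data_set = ctc.get_ephys_data(595570553)   # example specimen ID
print(data_set.file_name)                  # local path to the downloaded .nwb file
```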
Thanks I'll give that a shot.
@sgratiy @dyf
In https://github.com/AllenInstitute/ipfx/blob/659378e0d5ae28ee0b7f8d6fe2ba124e2d603b7b/ipfx/aibs_data_set.py#L85-L90 the entry `stimulus_code_ext` is added to the returned dict; it holds the stimulus code with the "Set Sweep Count" (i.e. the sweep number within a stimulus set that has multiple sweeps) appended.
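For illustration, the extended code is just the stimulus code with the set sweep count in brackets (the helper below is my own, matching the strings in the test data quoted next):

```python
def make_stim_code_ext(stim_code, set_sweep_count):
    """Illustrative helper: 'EXTPBREAKN141203', 0 -> 'EXTPBREAKN141203[0]'."""
    return "{}[{}]".format(stim_code, set_sweep_count)
```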
Looking into the repo with

```
$ git grep stimulus_code_ext
ipfx/aibs_data_set.py: sweep_record["stimulus_code_ext"] = stim_code_ext
tests/test_ephys_data_set.py: 'stimulus_code_ext': ['C1LSFINEST150112[0]','EXTPBREAKN141203[0]','EXTPBREAKN141203[0]'],
```

it looks like this is not used.
Is that info needed? Or was it just easy to query and therefore added? I can add that as well to my child class of EphysDataSet, as this should be in the ABF/DAT files.
@sgratiy @dyf
If I'm running

```
python ipfx/bin/run_pipeline.py --input_nwb_file 595570553.nwb --output_nwb output.nwb
```

on the master branch I'm getting

```
WARNING:root:many=True not supported from argparse
Traceback (most recent call last):
  File "ipfx/bin/run_pipeline.py", line 101, in <module>
    if __name__ == "__main__": main()
  File "ipfx/bin/run_pipeline.py", line 86, in main
    module = ags.ArgSchemaParser(schema_type=PipelineParameters)
  File "C:\Anaconda2\lib\site-packages\argschema\argschema_parser.py", line 175, in __init__
    result = self.load_schema_with_defaults(self.schema, args)
  File "C:\Anaconda2\lib\site-packages\argschema\argschema_parser.py", line 274, in load_schema_with_defaults
    result = utils.load(schema, args)
  File "C:\Anaconda2\lib\site-packages\argschema\utils.py", line 422, in load
    raise mm.ValidationError(errors)
marshmallow.exceptions.ValidationError: {'qc_criteria': [u'Missing data for required field.']}
```
Any idea what I'm doing wrong?
re: `stimulus_code_ext` -- it's not used anywhere. It was a requirement in a version of the pipeline from several years ago, so we kept it for backwards compatibility. It's not necessary for your files.
re: `qc_criteria` -- there is a json file containing the qc criteria:
https://github.com/AllenInstitute/ipfx/blob/master/ipfx/qc_criteria.json
This constant knows the path:
https://github.com/AllenInstitute/ipfx/blob/master/ipfx/qc_protocol.py#L6
Would you mind updating `run_pipeline.py` to load those QC criteria by default if they are not supplied? You can make it optional here:
https://github.com/AllenInstitute/ipfx/blob/master/ipfx/bin/run_pipeline.py#L93
And then load the `DEFAULT_QC_CRITERIA_FILE` if it is None here:
https://github.com/AllenInstitute/ipfx/blob/master/ipfx/bin/run_qc.py#L33
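A minimal sketch of that fallback (the helper function is mine; only `DEFAULT_QC_CRITERIA_FILE` and its location in `ipfx/qc_protocol.py` come from the links above):

```python
import json

from ipfx.qc_protocol import DEFAULT_QC_CRITERIA_FILE  # path constant linked above


def load_qc_criteria(qc_criteria=None):
    """Return the supplied criteria, or fall back to the shipped qc_criteria.json."""
    if qc_criteria is None:
        with open(DEFAULT_QC_CRITERIA_FILE, "r") as f:
            qc_criteria = json.load(f)
    return qc_criteria
```

With the fallback in place, the `qc_criteria` field in `PipelineParameters` could be declared non-required, which removes the "Missing data for required field" error shown above.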
You would get the default qc_criteria when running the scripts that generate the pipeline input. To generate the pipeline input json, use https://github.com/AllenInstitute/ipfx/blob/659378e0d5ae28ee0b7f8d6fe2ba124e2d603b7b/ipfx/bin/generate_pipeline_input.py#L120 and then run

```
python run_pipeline_extraction.py --input_json INPUT_JSON --output_json OUTPUT_JSON
```

Alternatively, there is a script that generates the pipeline input and runs the pipeline directly from the nwb file, https://github.com/AllenInstitute/ipfx/blob/master/ipfx/bin/pipeline_from_nwb_file.py:

```
python pipeline_from_nwb_file.py INPUT_NWB_FILE
```

Make sure to change the output cell directory in the script: https://github.com/AllenInstitute/ipfx/blob/659378e0d5ae28ee0b7f8d6fe2ba124e2d603b7b/ipfx/bin/pipeline_from_nwb_file.py#L8

Finally, there is a shell script doing the same thing: `pipeline_from_nwb.sh`.
@sgratiy Regarding #124: when I call the `pipeline_from_nwb_file.py` script I get

```
$ python pipeline_from_nwb_file.py ../../tests/595570553.nwb
Traceback (most recent call last):
  File "pipeline_from_nwb_file.py", line 5, in <module>
    import generate_pipeline_input as gpi
  File "E:\projekte\allensdk.ipfx\ipfx\bin\generate_pipeline_input.py", line 6, in <module>
    from generate_se_input import generate_se_input
  File "E:\projekte\allensdk.ipfx\ipfx\bin\generate_se_input.py", line 5, in <module>
    import make_stimulus_ontology as mso
  File "E:\projekte\allensdk.ipfx\ipfx\bin\make_stimulus_ontology.py", line 4, in <module>
    import lims_queries as lq
  File "E:\projekte\allensdk.ipfx\ipfx\bin\lims_queries.py", line 2, in <module>
    import allensdk.internal.core.lims_utilities as lu
ImportError: No module named internal.core.lims_utilities
```

but I have all packages from `requirements.txt` installed. Any ideas what I'm doing wrong?
This happens because we call `make_stimulus_ontology_from_lims()` when generating the pipeline input. That function uses an internal package, which was only meant to be used internally and is therefore not listed in the requirements. Could you install `allensdk_internal` for now? I agree that we need a better solution for this going forward.
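As one possible direction for that better solution (a sketch only, nothing decided in this thread), the import in `lims_queries.py` could be made optional so that generating pipeline input from a local NWB file does not require the internal package:

```python
# Sketch: guard the internal import so lims_queries.py can be imported
# without allensdk.internal installed; fail only when LIMS is actually used.
try:
    import allensdk.internal.core.lims_utilities as lu
except ImportError:
    lu = None


def _require_lims():
    """Raise a clear error when a LIMS-backed function is called without the package."""
    if lu is None:
        raise ImportError(
            "allensdk.internal is required for LIMS queries; "
            "install the internal AllenSDK package to use this function.")
```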
- Inherit from `EphysDataSet`; `AibsDataSet` is an example which reads MIES data
- Add tests a la `test_ephys_data_set.py`
- [x] `EphysDataSet` should be NWB agnostic
- [x] Move `Sweep` instance creation into the NWB reader

Stick with Python 2.7 for now.

The full pipeline can be run with `run_pipeline.py`.
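A minimal, hypothetical skeleton of the class the checklist describes (the name comes from the add-AibsForeignDataset branch; the module path and constructor arguments are assumptions, and the real interface to implement should be taken from `ipfx/ephys_data_set.py`):

```python
from ipfx.ephys_data_set import EphysDataSet  # module path assumed


class AibsForeignDataset(EphysDataSet):
    """Data set for ABF/DAT-derived NWB files, modeled on AibsDataSet,
    which plays the same role for MIES data (placeholder structure only)."""

    def __init__(self, nwb_file, ontology=None):
        # Constructor arguments are illustrative; the real base-class
        # signature should be checked in ephys_data_set.py.
        self.nwb_file = nwb_file
        self.ontology = ontology
        # Per the checklist, Sweep instances are created by the NWB reader,
        # so this class should only assemble the sweep table and metadata.
```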