gouwens closed this 3 years ago
Not sure what's going on with the failed test - that test (test_feature_vector_extraction) fails on my machine on the current master branch with a different error (below), so I can't run it to know if anything's changed.
________________________ test_feature_vector_extraction ________________________
tmpdir_factory = TempdirFactory(_tmppath_factory=TempPathFactory(_given_basetemp=None, _trace=<pluggy._tracing.TagTracerSub object at 0x7f41e9d1d0d0>, _basetemp=PosixPath('/tmp/pytest-of-nathang/pytest-1')))
    def test_feature_vector_extraction(tmpdir_factory):
        temp_output_dir = str(tmpdir_factory.mktemp("feature_vector"))
        test_output_dir = TEST_OUTPUT_DIR
        features = [
            "first_ap_v",
            "first_ap_dv",
            "isi_shape",
            "psth",
            "inst_freq",
            "spiking_width",
            "spiking_peak_v",
            "spiking_fast_trough_v",
            "spiking_threshold_v",
            "spiking_upstroke_downstroke_ratio",
            "step_subthresh",
            "subthresh_norm",
            "subthresh_depol_norm",
        ]
        run_feature_vector_extraction(ids=[500844783, 509604672],
                                      output_dir=temp_output_dir,
                                      data_source="filesystem",
                                      output_code="TEMP",
                                      project=None,
                                      output_file_type="npy",
                                      sweep_qc_option="none",
                                      include_failed_cells=True,
                                      run_parallel=False,
                                      ap_window_length=0.003,
>                                     file_list=test_nwb2_files
                                      )
tests/test_run_feature_vector.py:50:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
output_dir = '/tmp/pytest-of-nathang/pytest-1/feature_vector0'
data_source = 'filesystem', output_code = 'TEMP', project = None
output_file_type = 'npy', sweep_qc_option = 'none', include_failed_cells = True
run_parallel = False, ap_window_length = 0.003, ids = [500844783, 509604672]
file_list = {500844783: '/local1/repos/ipfx/tests/data/Vip-IRES-Cre;Ai14(IVSCC)-226110.03.01.nwb', 509604672: '/local1/repos/ipfx/tests/data/Vip-IRES-Cre;Ai14(IVSCC)-236654.04.02.nwb'}
kwargs = {}, specimen_ids = [500844783, 509604672]
ontology = <ipfx.stimulus.StimulusOntology object at 0x7f41e6e158d0>
get_data_partial = functools.partial(<function data_for_specimen_id at 0x7f41e9cc8710>, sweep_qc_option='none', data_source='filesystem',...e;Ai14(IVSCC)-226110.03.01.nwb', 509604672: '/local1/repos/ipfx/tests/data/Vip-IRES-Cre;Ai14(IVSCC)-236654.04.02.nwb'})
results = <map object at 0x7f41e6e1c250>
    def run_feature_vector_extraction(
        output_dir,
        data_source,
        output_code,
        project,
        output_file_type,
        sweep_qc_option,
        include_failed_cells,
        run_parallel,
        ap_window_length,
        ids=None,
        file_list=None,
        **kwargs
    ):
        """
        Extract feature vector from a list of cells and save result to the output file(s)

        Parameters
        ----------
        output_dir : str
            see CollectFeatureVectorParameters input schema for details
        data_source : str
            see CollectFeatureVectorParameters input schema for details
        output_code: str
            see CollectFeatureVectorParameters input schema for details
        project : str
            see CollectFeatureVectorParameters input schema for details
        output_file_type : str
            see CollectFeatureVectorParameters input schema for details
        sweep_qc_option: str
            see CollectFeatureVectorParameters input schema for details
        include_failed_cells: bool
            see CollectFeatureVectorParameters input schema for details
        run_parallel: bool
            see CollectFeatureVectorParameters input schema for details
        ap_window_length: float
            see CollectFeatureVectorParameters input schema for details
        ids: int
            ids associated to each cell.
        file_list: list of str
            nwbfile names
        kwargs

        Returns
        -------
        """
        if ids is not None:
            specimen_ids = ids
        elif data_source == "lims":
            specimen_ids = lq.project_specimen_ids(project, passed_only=not include_failed_cells)
        else:
            logging.error("Must specify input file if data source is not LIMS")

        if output_file_type == "h5":
            # Check that we can access the specified file before processing everything
            h5_file = h5py.File(os.path.join(output_dir, "fv_{}.h5".format(output_code)))
            h5_file.close()

        ontology = StimulusOntology(ju.read(StimulusOntology.DEFAULT_STIMULUS_ONTOLOGY_FILE))
        logging.info("Number of specimens to process: {:d}".format(len(specimen_ids)))

        get_data_partial = partial(data_for_specimen_id,
                                   sweep_qc_option=sweep_qc_option,
                                   data_source=data_source,
                                   ontology=ontology,
                                   ap_window_length=ap_window_length,
                                   file_list=file_list)

        if run_parallel:
            pool = Pool()
            results = pool.map(get_data_partial, specimen_ids)
        else:
            results = map(get_data_partial, specimen_ids)

>       used_ids, results, error_set = su.filter_results(specimen_ids, results)
E       TypeError: cannot unpack non-iterable NoneType object

ipfx/bin/run_feature_vector_extraction.py:308: TypeError
Figured out that the issue on my machine was that I didn't really have the test files, because I didn't have git-lfs installed and set up. The earlier test failure is because the reference test files exhibit the bug that's being fixed here (the feature vectors had fewer bins than they should have).
Thank you for contributing to IPFX; your work and time will help to advance open science!
Overview:
The feature vector extraction code sometimes produces vectors of different lengths that cannot be saved as an array in an H5 file. This happens because of differences in the floating point approximations of start and end times in different cells.
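As a minimal illustration of the problem (made-up times and bin width, not IPFX code), two cells covering a nominally identical one-second window can end up with different bin counts once the duration is computed from their floating-point start and end times:

import numpy as np

bin_width = 0.005  # 5 ms bins
start = 1.02
end_a = 2.0200000000000005  # nominally the same 1 s window...
end_b = 2.0199999999999996  # ...but off by float noise in the other direction

n_bins_a = int((end_a - start) / bin_width)  # 200
n_bins_b = int((end_b - start) / bin_width)  # 199 -- one bin short

Once one cell comes up a bin short, the per-cell vectors can no longer be stacked into a single rectangular array for the H5 output.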
Addresses:
Addresses issue #521
Type of Fix:
Solution:
Round the start and end values to the nearest millisecond before calculating the duration in spike-related feature vector calculations.
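Roughly, the idea is the following (an illustrative helper, not the actual IPFX change):

import numpy as np

def n_bins(start, end, bin_width=0.005):
    # Round start/end to the nearest millisecond first, so times that differ
    # only by floating-point noise yield the same duration and bin count.
    start = np.round(start, decimals=3)
    end = np.round(end, decimals=3)
    return int(np.round((end - start) / bin_width))

n_bins(1.02, 2.0200000000000005)  # 200
n_bins(1.02, 2.0199999999999996)  # 200 -- now matches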
Changes:
Rounded start and end times to the nearest millisecond before computing durations in psth_vector, inst_freq_vector, and spike_feature_vector
Added unit tests to test_feature_vector.py
Validation:
The example script that reproduces the issue now runs to completion, and the new unit tests pass.
Screenshots:
Unit Tests:
test_feature_vector.py
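(Sketch of the kind of check such a test can make; the helper and test names below are illustrative, not the actual contents of test_feature_vector.py.)

import numpy as np
import pytest

def binned_length(start, end, bin_width=0.005):
    # Stand-in for the spike-related feature vector length calculation after the fix.
    start, end = np.round(start, 3), np.round(end, 3)
    return int(np.round((end - start) / bin_width))

@pytest.mark.parametrize("end", [2.02, 2.0200000000000005, 2.0199999999999996])
def test_length_is_stable_under_float_noise(end):
    # Nominally identical 1 s windows should always produce the same number of bins.
    assert binned_length(1.02, end) == 200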
Script to reproduce error and fix:
Script that demonstrates the error:
Currently produces:
After fix produces: No message (runs to completion)
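As a stand-in for the failure mode the script demonstrates (made-up vector lengths, not the actual script):

import numpy as np

# Two cells whose feature vectors differ in length by one bin, as described in
# the overview. Before the fix, stacking them into the rectangular array needed
# for the H5 output raises a ValueError; after the fix both vectors have the
# same length and the stack succeeds.
vectors = [np.zeros(200), np.zeros(199)]
stacked = np.vstack(vectors)  # ValueError before the fix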
Configuration details:
Checklist
Notes: