Closed helenamrusso closed 2 years ago
Hi @helenamrusso I had to install this dev branch of yours to run Qemistree with the new Sirius version. Qemistree worked fine in qiime, as I could obtain lots of data in the key output files:
3.4G -rw-r--r-- 1 flejzerowicz knightlab 3.5G Jul 31 10:14 fingerprints.qza
949M -rw-r--r-- 1 flejzerowicz knightlab 1.1G Jul 30 23:31 fragmentation_trees.qza
946M -rw-r--r-- 1 flejzerowicz knightlab 1.1G Jul 31 03:25 molecular_formulas.qza
But now, I have an issue with this command:
qiime qemistree make-hierarchy \
--i-csi-results /projects/nutrition/foodomics/qemistree/fingerprints.qza \
--i-feature-tables /projects/nutrition/foodomics/qemistree/FEATURE-BASED-MOLECULAR-NETWORKING-d0797f2a-download_qza_table_data-main.qza \
--o-tree /projects/nutrition/foodomics/qemistree/qemistree.qza \
--o-feature-table /projects/nutrition/foodomics/qemistree/feature-table-hashed.qza \
--o-feature-data /projects/nutrition/foodomics/qemistree/feature-data.qza
The error (below) is related to a temporary directory being deleted too early. Not sure if I should open an issue on this, as it is based on using this non-merged branch.
Traceback (most recent call last):
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2cli/commands.py", line 328, in __call__
results = action(**arguments)
File "</home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/decorator.py:decorator-gen-217>", line 2, in make_hierarchy
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
output_types, provenance)
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 383, in _callable_executor_
output_views = self._callable(**view_args)
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_qemistree/_hierarchy.py", line 126, in make_hierarchy
qc_properties, metric)
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_qemistree/_process_fingerprint.py", line 95, in process_csi_results
collated_fps = collate_fingerprint(csi_result, qc_properties, metric)
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_qemistree/_process_fingerprint.py", line 45, in collate_fingerprint
index_col='relativeIndex', dtype=str, sep='\t')
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/io/parsers.py", line 685, in parser_f
return _read(filepath_or_buffer, kwds)
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/io/parsers.py", line 457, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/io/parsers.py", line 1135, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/flejzerowicz/usr/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/io/parsers.py", line 1917, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 689, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/panfs/panfs1.ucsd.edu/panscratch/flejzerowicz/dnn.qmstr_1536352/qiime2-archive-1vnkrtx5/493f64ca-ee65-4e90-9831-b19dcb729ea0/data/csi-output/fingerprints.csv' does not exist: b'/panfs/panfs1.ucsd.edu/panscratch/flejzerowicz/dnn.qmstr_1536352/qiime2-archive-1vnkrtx5/493f64ca-ee65-4e90-9831-b19dcb729ea0/data/csi-output/fingerprints.csv'
This is weird because temporary-files in panasas were used without issue for the previous qemistree steps. It seems that Qemistree created and deleted this fingerprints.csv temporary file before using it.
Thanks for any clue! Franck
Hi @FranckLejzerowicz Good to know that you got the output files as expected! I never had the error you described, but it seems to come from a small detail in the _process_fingerprint.py file. I just realized that I'm using an outdated version of pandas, so it was working fine for me. I talked with @anupriyatripathi and we fixed the issue; check the "fix pandas loc to reindex" modification, and I hope it will work for you now!
Thanks! Helena
Hi @helenamrusso Somehow, my problem persists and I can't figure out why. I added a couple of print statements to check the content of the csi_results directory and see whether the file reported missing is indeed missing:
def collate_fingerprint(csi_result: CSIDirFmt, qc_properties: bool = False,
                        metric: str = 'euclidean'):
    '''
    This function collates predicted chemical fingerprints for mass-spec
    features in an experiment.
    '''
    if isinstance(csi_result, CSIDirFmt):
        csi_result = str(csi_result.get_path())
    print(csi_result)
    fpfoldrs = os.listdir(csi_result)
    print(fpfoldrs)
and it shows:
/panfs/panfs1.ucsd.edu/panscratch/flejzerowicz/qiime2-archive-e216o7gj/493f64ca-ee65-4e90-9831-b19dcb729ea0/data/csi-output
['formula_identifications_adducts.tsv', 'canopus_summary_adducts.tsv', 'csi_fingerid.tsv', 'csi_fingerid_neg.tsv', 'formula_identifications.tsv', 'compound_identifications.tsv', 'compound_identifications_adducts.tsv', 'canopus_summary.tsv', 'report.mztab', '0_features_FEATURE_1233', '1_features_FEATURE_5513', '2_features_FEATURE_7501', [...etc...]
Hence, it indeed looks like the later attempt to read the file:
substructrs = pd.read_csv(os.path.join(csi_result, 'fingerprints.csv'),
index_col='relativeIndex', dtype=str, sep='\t')
fails because 'fingerprints.csv' does not exist (it should be in the fpfoldrs list printed above, right?)
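The check described above can be made explicit before pandas ever touches the file. The following is only a minimal sketch: `missing_files` is a hypothetical helper, not part of q2-qemistree, and the demo directory merely mimics the listing printed above (summary files present, no 'fingerprints.csv').

```python
import os
import tempfile


def missing_files(directory, expected):
    """Return the names from `expected` that are absent from `directory`.

    Hypothetical helper for illustration; not part of q2-qemistree.
    """
    present = set(os.listdir(directory))
    return [name for name in expected if name not in present]


# Demo on a throwaway directory that mimics the listing above:
# some summary files exist, but 'fingerprints.csv' does not.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('csi_fingerid.tsv', 'canopus_summary.tsv'):
        open(os.path.join(tmp, name), 'w').close()
    demo_missing = missing_files(tmp, ['fingerprints.csv', 'csi_fingerid.tsv'])

print(demo_missing)
```

Calling such a helper on the csi-output path before `pd.read_csv` would report the absent file by name instead of failing deep inside the parser.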
That's weird because it seems like a valid file generated using qemistree:
$ qiime tools peek /projects/nutrition/foodomics/qemistree/fingerprints.qza
UUID: 493f64ca-ee65-4e90-9831-b19dcb729ea0
Type: CSIFolder
Data format: CSIDirFmt
Command run:
qiime qemistree make-hierarchy \
--i-csi-results /projects/nutrition/foodomics/qemistree/fingerprints.qza \
--i-feature-tables /projects/nutrition/foodomics/qemistree/FEATURE-BASED-MOLECULAR-NETWORKING-d0797f2a-download_qza_table_data-main.qza \
--o-tree /projects/nutrition/foodomics/qemistree/qemistree.qza \
--o-feature-table /projects/nutrition/foodomics/qemistree/feature-table-hashed.qza \
--o-feature-data /projects/nutrition/foodomics/qemistree/feature-data.qza
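To rule out the temporary directory entirely, the artifact itself can be inspected: a .qza file is a zip archive, so its members can be listed without extracting anything. This is only a sketch; `find_in_archive` is a hypothetical helper, and the demo archive with its `uuid/...` member names is made up for illustration rather than taken from the real fingerprints.qza.

```python
import os
import tempfile
import zipfile


def find_in_archive(qza_path, filename):
    """Return archive members whose base name equals `filename`.

    Hypothetical helper for illustration; it relies only on the fact
    that a .qza artifact is a zip archive.
    """
    with zipfile.ZipFile(qza_path) as archive:
        return [member for member in archive.namelist()
                if member.rsplit('/', 1)[-1] == filename]


# Demo on a small made-up archive (member names are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'demo.qza')
    with zipfile.ZipFile(demo_path, 'w') as archive:
        archive.writestr('uuid/data/csi-output/csi_fingerid.tsv', 'x')
        archive.writestr('uuid/metadata.yaml', 'x')
    demo_hits = find_in_archive(demo_path, 'csi_fingerid.tsv')
    demo_none = find_in_archive(demo_path, 'fingerprints.csv')

print(demo_hits)
print(demo_none)
```

If the same lookup on the real artifact returns an empty list for 'fingerprints.csv', the file was never written into the artifact, which would point away from the temporary directory being deleted too early.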
Code adapted to work with Sirius 4.8.2, with the new command-line interface (>4.4.29). It seems to be working fine as the outputs are being generated correctly (comparing to old datasets), but please double-check.