Open ISU-rfitch opened 8 months ago
I think I found the problem. The blank and the samples were run under different programs: the time parameters are the same, but the blank used 3 microscans and the samples used 1, which changes the total number of scans in the chromatogram by a factor of three. I am rerunning without the blank and will advise on success/failure.
Unfortunately, this did not fix the problem completely. This time it ran for much less time but still gave a similar error.
Traceback (most recent call last):
  File "/data/ccms-gnps/tools/mshub-gc/release_30/proc/io/importmsdata.py", line 446, in mzml_reader
    X = np.array(sp.centroidedPeaks).astype(float)
  File "/data/ccms-gnps/tools/miniconda3_gamma/envs/mshub-gc/lib/python3.7/site-packages/pymzml/spec.py", line 1636, in centroidedPeaks
    return self.peaks("centroided")
  File "/data/ccms-gnps/tools/miniconda3_gamma/envs/mshub-gc/lib/python3.7/site-packages/pymzml/spec.py", line 1031, in peaks
    arr = np.stack((mz, i), axis=-1)
  File "<__array_function__ internals>", line 6, in stack
  File "/data/ccms-gnps/tools/miniconda3_gamma/envs/mshub-gc/lib/python3.7/site-packages/numpy/core/shape_base.py", line 425, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape
Not sure which may be the offending file. I will run through the batch to see if I can spot another file with troubles.
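One way to look for an offending file locally is to check every spectrum for m/z and intensity arrays of different lengths, since that is the condition `np.stack((mz, i), axis=-1)` in the traceback rejects. The following is only a minimal sketch using the Python standard library. It assumes the binary arrays are 32-bit or 64-bit floats, either uncompressed or zlib-compressed (no numpress), and it relies on the standard PSI-MS accessions for the array types. The function names are hypothetical and are not part of MSHub or pymzml.

```python
import base64
import sys
import xml.etree.ElementTree as ET
import zlib

NS = "{http://psi.hupo.org/ms/mzml}"   # mzML XML namespace
MZ_ARRAY = "MS:1000514"                # PSI-MS: m/z array
INTENSITY_ARRAY = "MS:1000515"         # PSI-MS: intensity array
FLOAT64 = "MS:1000523"                 # PSI-MS: 64-bit float
ZLIB = "MS:1000574"                    # PSI-MS: zlib compression


def decoded_length(bda):
    """Return (kind, number_of_values) for one <binaryDataArray> element."""
    accessions = {cv.get("accession") for cv in bda.findall(NS + "cvParam")}
    raw = base64.b64decode(bda.findtext(NS + "binary") or "")
    if raw and ZLIB in accessions:
        raw = zlib.decompress(raw)
    # Assumption: anything not flagged as 64-bit float is 32-bit.
    width = 8 if FLOAT64 in accessions else 4
    if MZ_ARRAY in accessions:
        kind = "mz"
    elif INTENSITY_ARRAY in accessions:
        kind = "intensity"
    else:
        kind = None
    return kind, len(raw) // width


def find_mismatched_spectra(path):
    """List (spectrum id, n_mz, n_intensity) where the two lengths differ."""
    bad = []
    for _, elem in ET.iterparse(path, events=("end",)):
        if elem.tag != NS + "spectrum":
            continue
        lengths = {}
        for bda in elem.iter(NS + "binaryDataArray"):
            kind, n = decoded_length(bda)
            if kind:
                lengths[kind] = n
        if lengths.get("mz") != lengths.get("intensity"):
            bad.append((elem.get("id"), lengths.get("mz"), lengths.get("intensity")))
        elem.clear()  # release the spectrum's children once it has been checked
    return bad


if __name__ == "__main__":
    for mzml_path in sys.argv[1:]:
        for spectrum_id, n_mz, n_int in find_mismatched_spectra(mzml_path):
            print(mzml_path, spectrum_id, "m/z values:", n_mz, "intensities:", n_int)
```

Saved under a name such as `check_mzml.py` (hypothetical), it could be run over the whole batch with `python check_mzml.py *.mzML`. Any file it reports would be a candidate to remove or reconvert before resubmitting.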
On rechecking, the blank file I included with the set was acquired under the same parameters as the samples, so removing it should have had no effect. I am therefore not sure why the rerun finished so quickly and did not produce the long list of array errors. Because of AGC, all of the files have a slightly different number of total scans, but all are around 3800. However, other Thermo MS instruments such as Orbitraps also use AGC, so this should not be the problem. Again, any suggestions would be welcome.
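For reference, the failing call in the traceback is `np.stack((mz, i), axis=-1)`, which pairs the m/z and intensity arrays of a single spectrum. A minimal sketch with made-up peak values (the helper name `stack_peaks` is hypothetical) suggests that the error arises when those two arrays differ in length within one spectrum, which is a different condition from files having different total scan counts:

```python
import numpy as np


def stack_peaks(mz, intensity):
    """Pair m/z and intensity values into an (n_peaks, 2) array."""
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    return np.stack((mz, intensity), axis=-1)


# Equal lengths: stacking works and gives one row per peak.
ok = stack_peaks([50.0, 51.0, 52.0], [10.0, 20.0, 30.0])

# Unequal lengths within one spectrum: np.stack raises ValueError.
try:
    stack_peaks([50.0, 51.0, 52.0], [10.0, 20.0])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok.shape)
print(error_message)
```

If this reading is right, the place to look is inside individual spectra of the converted mzML files, not at the scan totals.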
Describe the bug
An MSHub processing job fails on a set of 166 files. The data are Thermo iTQ ion trap EI data, converted from raw to mzML with ProteoWizard MSConvert using peak picking (vendor algorithm). All samples were run under the same conditions (column program, etc.).
The persistent error is "all input arrays must have the same shape". I wonder if it has something to do with an inconsistent number of scans due to AGC?
Excerpt:
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape
Traceback (most recent call last):
  File "/data/ccms-gnps/tools/mshub-gc/release_30/proc/io/importmsdata.py", line 446, in mzml_reader
    X = np.array(sp.centroidedPeaks).astype(float)
  File "/data/ccms-gnps/tools/miniconda3_gamma/envs/mshub-gc/lib/python3.7/site-packages/pymzml/spec.py", line 1636, in centroidedPeaks
    return self.peaks("centroided")
  File "/data/ccms-gnps/tools/miniconda3_gamma/envs/mshub-gc/lib/python3.7/site-packages/pymzml/spec.py", line 1031, in peaks
    arr = np.stack((mz, i), axis=-1)
  File "<__array_function__ internals>", line 6, in stack
  File "/data/ccms-gnps/tools/miniconda3_gamma/envs/mshub-gc/lib/python3.7/site-packages/numpy/core/shape_base.py", line 425, in stack
The error message repeats for each subsequent file until the system gives up at 87 files and does not read any further.
all input arrays must have the same shape
and so on...
May be a newbie issue, first time using GNPS. All help appreciated. Many thanks, Rick Fitch