ziyewang / MetaBinner

GNU General Public License v3.0

Got error: out of bounds for axis 0 while running Metabinner.py #2

Closed dawnmy closed 4 years ago

dawnmy commented 4 years ago
Traceback (most recent call last):
  File "/CAMI/docker/binners/MetaBinner/Metabinner.py", line 913, in <module>
    X_t, namelist, mapObj, X_cov_sr, X_com = gen_X(com_file, cov_file)
  File "/CAMI/docker/binners/MetaBinner/Metabinner.py", line 96, in gen_X
    compositMat = shuffled_compositMat[covIdxArr]
IndexError: index 94388951158496 is out of bounds for axis 0 with size 349831

BTW, run.sh produced only two coverage files even though I used both short reads and long reads to compute coverage. However, the example command line uses two coverage files: coverage_sr_new.tsv and coverage_pb_new.tsv. Did I get something wrong?

ziyewang commented 4 years ago

This happens when some contigs are present in the coverage file but not in com_file. Please filter those contigs out. As for your second question, I didn't understand it: did you get two coverage files or one?

Best, Ziye
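
The filtering step suggested above can be sketched as follows. This is a minimal illustration, not MetaBinner's own code: it assumes both files are tab-separated, have one header row, and carry the contig ID in the first column. The function names `read_contig_ids` and `filter_coverage` are hypothetical.

```python
import csv
import io


def read_contig_ids(com_handle, has_header=True):
    """Collect contig IDs from the first column of the composition file."""
    reader = csv.reader(com_handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row (assumed to exist)
    return {row[0] for row in reader if row}


def filter_coverage(cov_handle, keep_ids, out_handle, has_header=True):
    """Write only coverage rows whose contig ID is in keep_ids.

    Returns the number of rows that were dropped.
    """
    reader = csv.reader(cov_handle, delimiter="\t")
    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
    if has_header:
        header = next(reader, None)
        if header is not None:
            writer.writerow(header)
    dropped = 0
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if row[0] in keep_ids:
            writer.writerow(row)
        else:
            dropped += 1
    return dropped


# Toy data: contig_3 appears in the coverage file only.
com_text = "contig\tAAAA\tAAAC\ncontig_1\t0.1\t0.2\ncontig_2\t0.3\t0.4\n"
cov_text = (
    "contig\ts1\ts2\n"
    "contig_1\t5.0\t6.0\n"
    "contig_2\t1.5\t2.5\n"
    "contig_3\t9.0\t9.5\n"
)

keep_ids = read_contig_ids(io.StringIO(com_text))
out = io.StringIO()
dropped = filter_coverage(io.StringIO(cov_text), keep_ids, out)
filtered_text = out.getvalue()
print(f"dropped {dropped} contig(s) missing from the composition file")
```

With real files, the `io.StringIO` objects would be replaced by handles from `open(...)`; the filtered output is then what gets passed to Metabinner.py in place of the original coverage file.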

dawnmy commented 4 years ago

Sorry for the typo. run.sh produced two coverage files, coverage_new.tsv and coverage.tsv, when I used both short reads and long reads to compute coverage. But the example command line uses the coverage files coverage_sr_new.tsv and coverage_pb_new.tsv. Did I get something wrong?

dawnmy commented 4 years ago

coverage_new.tsv contains both PacBio and short-read coverage, but with different IDs. For ten samples, the PacBio IDs start from 11, while the short-read IDs start from 1. Can I use this as the input file for the option --coverage_profiles?

ziyewang commented 4 years ago

You can use it as the input file for that option, but you should use "coverage_new_f1000.tsv" as the input if your com_file only contains contigs longer than 1000 bp. However, I advise splitting the coverage file into one coverage file for long reads (coverage_pb) and one for short reads (coverage_sr), as shown in "run.sh" and "split_coverage.py".

Best, Ziye
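
For illustration only, the split described above could look like the sketch below. It is not the repository's split_coverage.py. It assumes each row is a contig ID followed by the short-read coverage columns and then the long-read columns, which matches the layout described earlier in the thread (short reads numbered from 1, PacBio from 11 for ten samples). The helper name `split_coverage_row` is hypothetical.

```python
def split_coverage_row(row, n_short_read_samples):
    """Split one coverage row into a short-read row and a long-read row.

    row: [contig_id, cov_1, ..., cov_n], short-read columns first (assumed).
    Both returned rows keep the contig ID as their first field.
    """
    contig_id, values = row[0], row[1:]
    if not 0 <= n_short_read_samples <= len(values):
        raise ValueError("n_short_read_samples does not fit the row width")
    short_read_row = [contig_id] + values[:n_short_read_samples]
    long_read_row = [contig_id] + values[n_short_read_samples:]
    return short_read_row, long_read_row


# Toy row with three short-read samples followed by two long-read samples.
row = ["contig_1", "5.0", "6.0", "7.0", "0.4", "0.9"]
sr_row, pb_row = split_coverage_row(row, 3)
print(sr_row)
print(pb_row)
```

Applying the same function to the header row keeps the column labels aligned with the values in each output file.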

dawnmy commented 4 years ago

Thank you for the explanation. I just realized that run.sh did not work properly because the Click package was missing. That is why I could not find the coverage_sr_new.tsv and coverage_pb_new.tsv files. Maybe Click should also be added to the conda env YAML file.

ziyewang commented 4 years ago

Thanks for the kind reminder.