MariaNattestad / Assemblytics

Assemblytics is a bioinformatics tool to detect and analyze structural variants from a genome assembly by comparing it to a reference genome.
http://assemblytics.com
MIT License

`TypeError: list indices must be integers or slices, not float` with uncompressed delta file #25

Closed by 0xaf1f 4 years ago

0xaf1f commented 4 years ago

Since #24 looks to be related to passing compressed input, I tried using an uncompressed delta file and got a different error:

$ gunzip -k user_data/TvPSMK5FFq1RcNce2lzB/Escherichia_coli_MHAP_assembly.Assemblytics.unique_length_filtered_l10000.delta.gz
$ Assemblytics/scripts/Assemblytics user_data/TvPSMK5FFq1RcNce2lzB/Escherichia_coli_MHAP_assembly.Assemblytics.unique_length_filtered_l10000.delta out 10000 50 10000
Input delta file: user_data/TvPSMK5FFq1RcNce2lzB/Escherichia_coli_MHAP_assembly.Assemblytics.unique_length_filtered_l10000.delta
Output prefix: out
Unique anchor length: 10000
Minimum variant size to call: 50
Maximum variant size to call: 10000
Logging progress updates in out/progress.log
script path: Assemblytics/scripts
1. Filter delta file
Keeping fully unique alignments even if they are below the unique anchor length of 10000 bp
Use --unique-length X to set the unique anchor length requirement. Default is 10000, such that each alignment must have at least 10000 bp from the query that are not included in any other alignments.
header:
/seq/schatz/mnattest/Assemblytics/Ecoli/GCF_000005845.2_ASM584v2_genomic.fna /seq/schatz/mnattest/Assemblytics/Ecoli/GCF_000801205.1_ASM80120v1_genomic.fna
NUCMER
First read through the file: 0 seconds for 0 query-reference combinations
Filtering alignments of 1 queries
Traceback (most recent call last):
  File "Assemblytics/scripts/Assemblytics_uniq_anchor.py", line 355, in <module>
    main()
  File "Assemblytics/scripts/Assemblytics_uniq_anchor.py", line 352, in main
    args.func(args)
  File "Assemblytics/scripts/Assemblytics_uniq_anchor.py", line 96, in run
    alignments_to_keep[query] = summarize_planesweep(lines_by_query[query], unique_length_required = unique_length,keep_small_uniques=keep_small_uniques)
  File "Assemblytics/scripts/Assemblytics_uniq_anchor.py", line 307, in summarize_planesweep
    i = binary_search(query_min,sorted_unique_intervals_left,0,len(sorted_unique_intervals_left))
  File "Assemblytics/scripts/Assemblytics_uniq_anchor.py", line 335, in binary_search
    if query == numbers[mid]:
TypeError: list indices must be integers or slices, not float
$ echo $?
0

The zero exit code when the program fails is also a problem: workflow managers will treat the run as successful, since 0 indicates success.
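To illustrate the exit-code point, here is a minimal sketch of a driver that stops at the first failing step and reports that step's status. It is not how Assemblytics is implemented; `run_step` and `run_pipeline` are hypothetical names, and the child commands are stand-ins for real pipeline steps.

```python
import subprocess
import sys


def run_step(cmd):
    """Run one pipeline step and return its exit status (hypothetical helper)."""
    return subprocess.run(cmd).returncode


def run_pipeline(steps):
    """Run steps in order; return the first non-zero status, or 0 if all pass."""
    for cmd in steps:
        status = run_step(cmd)
        if status != 0:
            # Stop here so a failed step is not masked by later ones.
            return status
    return 0


# Stand-in steps: the second one dies with an uncaught exception,
# which makes the Python interpreter exit with status 1.
steps = [
    [sys.executable, "-c", "pass"],
    [sys.executable, "-c", "raise TypeError('simulated failure')"],
    [sys.executable, "-c", "pass"],
]
status = run_pipeline(steps)
print("pipeline exit status:", status)
# A real driver would finish with sys.exit(status) so callers see the failure.
```

With a driver along these lines, a workflow manager checking `$?` would see a non-zero status whenever a step crashes.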

0xaf1f commented 4 years ago

Both this and #24 occurred with Python 3.7.
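The Python version is probably relevant: in Python 3, `/` always returns a float, so a midpoint computed as `(left + right) / 2` cannot be used as a list index. The sketch below assumes that is what happens in `binary_search`; I have not confirmed it against the script, and `binary_search_sketch` is a hypothetical stand-in rather than the code in `Assemblytics_uniq_anchor.py`.

```python
def binary_search_sketch(query, numbers, left, right):
    """Return the index of query in sorted numbers[left:right], or None.

    Hypothetical stand-in for illustration; the real function may differ.
    """
    while left < right:
        # Floor division keeps mid an int on both Python 2 and Python 3.
        mid = (left + right) // 2
        if query == numbers[mid]:
            return mid
        if query < numbers[mid]:
            right = mid
        else:
            left = mid + 1
    return None


values = [10, 20, 30, 40, 50]

# True division yields a float index and reproduces the reported error type.
try:
    values[(0 + len(values)) / 2]
    float_index_failed = False
except TypeError:
    float_index_failed = True

print(float_index_failed)                                   # True on Python 3
print(binary_search_sketch(30, values, 0, len(values)))     # 2
```

Under Python 2 the same `/` expression on two ints would floor silently, which would explain why the script worked there.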

MariaNattestad commented 4 years ago

Fixed in https://github.com/MariaNattestad/Assemblytics/pull/26