tanghaibao / jcvi

Python library to facilitate genome assembly, annotation, and comparative genomics
BSD 2-Clause "Simplified" License

ALLMAPS: ValueError: max() arg is an empty sequence #119

Closed ckeeling closed 3 years ago

ckeeling commented 5 years ago

Although the test data worked fine, I get an error near the end of this command with my own data. In this case I have only one CSV file merged into the bed file, but I get the same error with two CSV files:

python -m jcvi.assembly.allmaps path Steven.bed JB_1-20120605-female.fa -w weights.txt
...
04:38:07 [agp] Target fasta written to `Steven.unplaced.fasta`.
04:38:07 [base] cat Steven.chr.agp Steven.unplaced.agp >Steven.agp
04:38:07 [base] cat Steven.chr.fasta Steven.unplaced.fasta >Steven.fasta
04:38:08 [base] Load file `Steven.agp`
04:38:11 [base] Load file `JB_1-20120605-female.fa.sizes`
04:38:12 [base] faSize -detailed /home/vagrant/Steven.fasta >Steven.fasta.sizes
04:38:14 [base] Load file `Steven.fasta.sizes`
04:38:17 [chain] File written to `Steven.chain`.
04:38:17 [base] liftOver -minMatch=1 Steven.bed Steven.chain Steven.lifted.bed unmapped
Reading liftover chains
04:39:52 [base] sort -k1,1 -k2,2n -k3,3n -k4,4 Steven.lifted.bed -o Steven.lifted.bed
04:39:52 [base] Load file `Steven.bed`
04:39:53 [allmaps] Map contains 1729 markers in 12 linkage groups.
04:39:53 [base] Load file `JB_1-20120605-female.fa.sizes`
04:39:58 [base] Load file `Steven.chr.agp`
04:39:58 [base] Load file `Steven.chr.agp`
04:39:58 [base] Load file `Steven.lifted.bed`
04:39:58 [allmaps] Map contains 0 markers in 0 linkage groups.
04:39:58 [base] Load file `weights.txt`
04:39:58 [base] Imported 1 records from `weights.txt`.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/jcvi/assembly/allmaps.py", line 1871, in <module>
    main()
  File "/usr/local/lib/python2.7/dist-packages/jcvi/assembly/allmaps.py", line 827, in main
    p.dispatch(globals())
  File "/usr/local/lib/python2.7/dist-packages/jcvi/apps/base.py", line 96, in dispatch
    globals[action](sys.argv[2:])
  File "/usr/local/lib/python2.7/dist-packages/jcvi/assembly/allmaps.py", line 1525, in path
    "--figsize={0}".format(opts.figsize)])
  File "/usr/local/lib/python2.7/dist-packages/jcvi/assembly/allmaps.py", line 1867, in plotall
    plot(xargs + [seqid])
  File "/usr/local/lib/python2.7/dist-packages/jcvi/assembly/allmaps.py", line 1699, in plot
    weights = Weights(weightsfile, mapnames)
  File "/usr/local/lib/python2.7/dist-packages/jcvi/assembly/allmaps.py", line 625, in __init__
    pivot_weight, o, pivot = self.get_pivot(mapnames)
  File "/usr/local/lib/python2.7/dist-packages/jcvi/assembly/allmaps.py", line 646, in get_pivot
    for m, w in self.items() if m in mapnames)
ValueError: max() arg is an empty sequence
>

Most of the files are generated, but not the plots.
The files `unmapped` and `Steven.lifted.bed` are created but are empty. If I run the following command on its own, it gets killed:

>liftOver -minMatch=1 Steven.bed Steven.chain Steven.lifted.bed unmapped
Reading liftover chains
Killed
>

This was run with 8GB of RAM. When I repeated the python allmaps command with 128GB available, it finished properly. Thus, it seems that liftOver is being killed due to insufficient memory, and this failure is not being noticed or properly handled by allmaps. Maybe it could check if the lifted.bed file contains 0 markers in 0 linkage groups, and display an error.
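
A check like the one suggested above could look roughly like this. It is only a sketch, not ALLMAPS code: `count_bed_records` and `require_markers` are hypothetical helper names, and it assumes Python 3 and a plain BED file with one marker per non-blank line.

```python
import os


def count_bed_records(path):
    """Count non-blank, non-comment lines in a BED file (hypothetical helper)."""
    if not os.path.exists(path):
        return 0
    with open(path) as handle:
        return sum(
            1 for line in handle
            if line.strip() and not line.startswith("#")
        )


def require_markers(path):
    """Raise a readable error instead of failing later on an empty map."""
    n = count_bed_records(path)
    if n == 0:
        raise RuntimeError(
            "{} contains 0 markers; the preceding liftOver step may have "
            "failed (for example, killed for lack of memory)".format(path)
        )
    return n
```

Calling `require_markers("Steven.lifted.bed")` right after the liftOver step would stop the run with a clear message before the plotting code is reached.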

Otherwise, awesome!

tanghaibao commented 5 years ago

@ckeeling

Thanks for reporting the failure - I'll try to add code to handle this error. The best place to catch it is probably at the point where liftOver fails (or right after "Map contains 0 markers in 0 linkage groups").

However, it is a little strange that liftOver uses this much RAM. Did you have a large .chain file or .bed file?

Haibao
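
Catching the failure at the liftOver step itself, as mentioned in the comment above, could be sketched as follows. This is an illustration under assumptions, not how jcvi actually launches external tools: `run_checked` is a hypothetical helper, and it assumes Python 3.5+ with the command passed as an argument list rather than through a shell.

```python
import subprocess


def run_checked(cmd):
    """Run an external command and fail loudly if it does not exit cleanly.

    `cmd` is a list of arguments. Hypothetical helper, not part of jcvi.
    """
    proc = subprocess.run(cmd)
    if proc.returncode < 0:
        # On POSIX, a negative return code means the process was terminated
        # by a signal (for example -9 for SIGKILL).
        raise RuntimeError(
            "{} was killed by signal {}".format(cmd[0], -proc.returncode)
        )
    if proc.returncode != 0:
        raise RuntimeError(
            "{} exited with status {}".format(cmd[0], proc.returncode)
        )
    return proc
```

With something like `run_checked(["liftOver", "-minMatch=1", "Steven.bed", "Steven.chain", "Steven.lifted.bed", "unmapped"])`, a killed liftOver would stop the pipeline immediately instead of surfacing later as an empty map. When a command is run through a shell instead, a kill typically shows up as exit status 137 rather than a negative return code.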

ckeeling commented 5 years ago

Hello @tanghaibao

The .chain was 20MB and the .bed was 90KB. I am running ALLMAPS within a Singularity image. In the first case it was run in an environment with 8GB, and in the second with 128GB; I didn't determine the minimum that was needed.

I see now an earlier still-open issue by @wolverineGT has a similar output.

Cheers, Chris

jinshangkun commented 3 years ago

Hi, I also encountered the same problem. How can I solve it? Thanks!

mrmrwinter commented 2 years ago

Hi, I've just encountered this problem when trying to plot a macrosynteny figure in MCScan.

SOLVED: There were spaces in the seqids file. I removed them and it worked.
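
A quick way to spot this kind of problem is to scan the seqids file for stray whitespace. A minimal sketch, assuming the file holds one comma-separated list of sequence IDs per line; `find_whitespace_in_seqids` is a hypothetical helper, not part of jcvi.

```python
def find_whitespace_in_seqids(path):
    """Return (line_number, seqid) pairs for IDs that contain whitespace."""
    problems = []
    with open(path) as handle:
        for lineno, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line:
                continue
            for seqid in line.split(","):
                # Flags leading, trailing, or embedded spaces and tabs.
                if any(c.isspace() for c in seqid):
                    problems.append((lineno, seqid))
    return problems
```

An empty list means no ID contains whitespace; otherwise each entry gives the line to fix.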