tanghaibao / jcvi

Python library to facilitate genome assembly, annotation, and comparative genomics
BSD 2-Clause "Simplified" License

jcvi.graphics.dotplot ERROR #515

Closed francicco closed 1 year ago

francicco commented 1 year ago

Hi @tanghaibao ,

Sorry to bother you again, I'm getting a couple of errors here. The first one is related to jcvi.graphics.dotplot:

[13:46:05] DEBUG    Load file `Hmel.Eisa.anchors`                                                                                                                                                                                                                                                                                                                 base.py:34
           DEBUG    Assuming --qbed=Hmel.bed --sbed=Eisa.bed                                                                                                                                                                                                                                                                                                  synteny.py:496
           DEBUG    Load file `Hmel.bed`                                                                                                                                                                                                                                                                                                                          base.py:34
           DEBUG    Load file `Eisa.bed`                                                                                                                                                                                                                                                                                                                          base.py:34
[13:46:07] DEBUG    A total of 8451 markers written to `Hmel.scaff.bed`.                                                                                                                                                                                                                                                                                     allmaps.py:1321
           DEBUG    Weights file written to `weights.txt`.                                                                                                                                                                                                                                                                                                   allmaps.py:1338
[13:46:10] DEBUG    Assuming --qbed=Hmel.bed --sbed=Eisa.bed                                                                                                                                                                                                                                                                                                  synteny.py:496
           DEBUG    Load file `Hmel.bed`                                                                                                                                                                                                                                                                                                                          base.py:34
           DEBUG    Load file `Eisa.bed`                                                                                                                                                                                                                                                                                                                          base.py:34
Traceback (most recent call last):
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/dotplot.py", line 537, in <module>
    dotplot_main(sys.argv[1:])
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/dotplot.py", line 507, in dotplot_main
    dotplot(
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/dotplot.py", line 314, in dotplot
    nv = block_color or value
UnboundLocalError: local variable 'block_color' referenced before assignment

The other one is related to jcvi.graphics.karyotype. My seqids and layout files look like this:

Eisa1Z00,Eisa0200,Eisa0300,Eisa0400,Eisa0500,Eisa0600,Eisa0700,Eisa0800,Eisa0900,Eisa1000,Eisa1100,Eisa1200,Eisa1300,Eisa1400,Eisa1500,Eisa1600,Eisa1700,Eisa1800,Eisa1900,Eisa2000,Eisa2100,Eisa2200,Eisa2300,Eisa2400,Eisa2500,Eisa2600,Eisa2700,Eisa2800,Eisa2900,Eisa3000,Eisa3100
Hmel221001o,Hmel202001o,Hmel201001o,Hmel203001o,Hmel203002o,Hmel203003o,Hmel204001o,Hmel204002o,Hmel205001o,Hmel206001o,Hmel207001o,Hmel208001o,Hmel209001o,Hmel210001o,Hmel211001o,Hmel212001o,Hmel213001o,Hmel214001o,Hmel214002o,Hmel214003o,Hmel214004o,Hmel214005o,Hmel215002o,Hmel215003o,Hmel216002o,Hmel217001o,Hmel218002o,Hmel218003o,Hmel219001o,Hmel219002o,Hmel220001o,Hmel220003o
# y, xstart, xend, rotation, color, label, va,  bed
 .6,     .1,    .8,       0,      , Hmel, top, Hmel.bed
 .4,     .1,    .8,       0,      , Eisa, top, Eisa.bed
# edges
e, 0, 1, Eisa.Hmel.anchors.simple

The Eisa.Hmel.anchors.simple file has just one line, and this is the error:

[13:41:54] DEBUG    Load file `layout`                                                                                                                                                                                                                                                                                                                            base.py:34
           DEBUG    Load file `Hmel.bed`                                                                                                                                                                                                                                                                                                                          base.py:34
           DEBUG    Load file `Eisa.bed`                                                                                                                                                                                                                                                                                                                          base.py:34
           ERROR    Size of `Eisa1Z00` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa0200` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa0300` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa0400` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa0500` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa0600` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa0700` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa0800` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa0900` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1000` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1100` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1200` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1300` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1400` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1500` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1600` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1700` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1800` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa1900` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2000` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2100` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2200` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2300` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2400` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2500` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2600` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2700` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2800` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa2900` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa3000` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Eisa3100` is empty. Please check                                                                                                                                                                                                                                                                                               karyotype.py:383
           ERROR    Size of `Hmel221001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel202001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel201001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel203001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel203002o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel203003o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel204001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel204002o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel205001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel206001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel207001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel208001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel209001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel210001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel211001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel212001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel213001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel214001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel214002o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel214003o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel214004o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel214005o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel215002o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel215003o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel216002o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel217001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel218002o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel218003o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel219001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel219002o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel220001o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
           ERROR    Size of `Hmel220003o` is empty. Please check                                                                                                                                                                                                                                                                                            karyotype.py:383
Traceback (most recent call last):
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/karyotype.py", line 464, in <module>
    main()
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/karyotype.py", line 444, in main
    Karyotype(
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/karyotype.py", line 390, in __init__
    tr = Track(root, lo, gap=gap, height=height, lw=lw, draw=False)
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/karyotype.py", line 174, in __init__
    ratio = span / total
ZeroDivisionError: float division by zero

What am I doing wrong? Thanks a lot, F

tanghaibao commented 1 year ago

@francicco

There are multiple issues here, possibly with the input files.

The first error is caused by a malformed Hmel.Eisa.anchors. Normally this file contains syntenic blocks separated by # lines; those separators seem to be missing, hence the error. Was the file generated by the JCVI package or custom-made? Please check.
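A quick way to check an anchors file for block separators is sketched below. It assumes only what is stated above: separator lines start with #, and every other non-empty line is a gene pair. The function name and the example gene names are hypothetical, and this is not JCVI's own parser.

```python
def count_anchor_blocks(lines):
    """Return (n_separators, n_pairs) for the lines of an anchors file.

    Assumption: a line starting with '#' opens a new syntenic block and
    any other non-empty line is one anchor (gene pair).
    """
    n_separators = 0
    n_pairs = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("#"):
            n_separators += 1
        else:
            n_pairs += 1
    return n_separators, n_pairs


# Hypothetical content: one block separator followed by two gene pairs.
example = ["###", "geneA1\tgeneB1\t100", "geneA2\tgeneB2\t90"]
print(count_anchor_blocks(example))  # prints (1, 2)
```

A result with zero separators but a non-zero number of pairs would match the situation described here: anchors present, but no block structure for the dotplot to color.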

The second error is caused by certain chromosomes missing from the BED file. For example, does Eisa.bed contain any genes on Eisa1Z00? This is usually caused by a mismatch between the chromosome names you list and the names in the BED file.
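The name check can be done by hand with a few lines of Python. The sketch below assumes a standard tab-delimited BED file with the chromosome name in the first column; the function name and the example gene are hypothetical.

```python
def missing_seqids(seqids, bed_lines):
    """Return the seqids that never appear in column 1 of the BED lines."""
    bed_chroms = set()
    for line in bed_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        bed_chroms.add(line.split()[0])
    return [s for s in seqids if s not in bed_chroms]


# Hypothetical BED content with a single gene on Eisa0200.
bed_lines = ["Eisa0200\t100\t500\tgene1\t0\t+"]
print(missing_seqids(["Eisa1Z00", "Eisa0200"], bed_lines))  # prints ['Eisa1Z00']
```

Any name returned here would trigger a "Size of ... is empty" message, because no features are found for it in the BED file it is checked against.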

francicco commented 1 year ago

Hi @tanghaibao

Yes, I generated Hmel.Eisa.anchors myself. It worked for jcvi.assembly.syntenypath bed, but probably not for the dotplot.

For the second error, I think the problem is that Eisa.Hmel.anchors.simple only contains one line...

tanghaibao commented 1 year ago

Hi @francicco

I added a change that might help with problem 1 (https://github.com/tanghaibao/jcvi/commit/db439a4736f713b61fffbf8f9dd7f5a2b777b4af). It may not completely fix the issue, but it's worth a try (pip install -U jcvi).

Let me know how I can help on problem 2.

francicco commented 1 year ago

Thanks a lot!

What info do you need for the second problem? Do you want the input files?

F

francicco commented 1 year ago

Also, I think the update didn't work:

[18:24:53] DEBUG    Assuming --qbed=Mpol.bed --sbed=Mmen.bed                                                                                                                                  synteny.py:496
           DEBUG    Load file `Mpol.bed`                                                                                                                                                          base.py:34
           DEBUG    Load file `Mmen.bed`                                                                                                                                                          base.py:34
Traceback (most recent call last):
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/dotplot.py", line 537, in <module>
    dotplot_main(sys.argv[1:])
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/dotplot.py", line 507, in dotplot_main
    dotplot(
  File "/user/work/tk19812/software/jcvi/jcvi/graphics/dotplot.py", line 314, in dotplot
    nv = block_color or value
UnboundLocalError: local variable 'block_color' referenced before assignment

francicco commented 1 year ago

Here are the inputs in case you need them: karyotypeinput.tar.gz F

tanghaibao commented 1 year ago

@francicco

Thanks for sending the input. Very helpful.

Problem 1 seems to be fixed by the commit that I mentioned. Can you verify that you have the latest:

python -c "import jcvi; print(jcvi.__version__)"
v1.2.16

If not, try pip install -U jcvi (-U is for upgrade).

Problem 2 is due to the wrong order in the seqids file. Since track 0 is Hmel and track 1 is Eisa, the chromosomes in seqids need to match that order (the first line is Hmel and the second line is Eisa). When I flipped the two lines, the script ran without producing an error.
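The ordering can be checked before plotting. This is only a sketch: it assumes each seqids line is a comma-separated list of chromosome names for one track, and the chromosome and gene names are hypothetical.

```python
def bed_seqids(bed_lines):
    """Collect the chromosome names (first column) used in a BED file."""
    return {line.split("\t")[0] for line in bed_lines if line.strip()}


def seqids_matches_tracks(seqids_lines, track_beds):
    """True if every name on seqids line i occurs in the BED of track i."""
    rows = [set(line.strip().split(",")) for line in seqids_lines if line.strip()]
    if len(rows) != len(track_beds):
        return False
    return all(row <= bed_seqids(bed) for row, bed in zip(rows, track_beds))


# Hypothetical one-gene BEDs for track 0 (Hmel) and track 1 (Eisa)
hmel_bed = ["HmelChr1\t10\t50\tHmelG1\t0\t+"]
eisa_bed = ["EisaChr1\t20\t90\tEisaG1\t0\t+"]

print(seqids_matches_tracks(["HmelChr1", "EisaChr1"], [hmel_bed, eisa_bed]))  # True
print(seqids_matches_tracks(["EisaChr1", "HmelChr1"], [hmel_bed, eisa_bed]))  # False
```

A False result with the lines in one order and True with them flipped points to the swapped-lines situation described here.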

Even though this runs, it does not visualize the synteny blocks as you expected. There are many synteny blocks, but the .simple file has only one line, and that line does not look like a valid block (the start and end genes are not on the same chromosome). So this doesn't quite work.

I would suggest generating a .anchors file, if you can, with the proper block separator (#), and then using python -m jcvi.compara.synteny simple to generate a proper .simple file that is compatible with what jcvi expects.
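For checking an existing .simple file, here is a rough sketch. It assumes the first four whitespace-separated columns of each row are query start gene, query end gene, subject start gene, and subject end gene, and that gene names sit in the fourth BED column; all names in the example are hypothetical.

```python
def gene_to_seqid(bed_lines):
    """Map gene name (BED column 4) to chromosome (BED column 1)."""
    mapping = {}
    for line in bed_lines:
        if not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        mapping[cols[3]] = cols[0]
    return mapping


def bad_simple_rows(simple_lines, qbed_lines, sbed_lines):
    """Return 1-based row numbers whose endpoints are unknown or on different chromosomes."""
    q = gene_to_seqid(qbed_lines)
    s = gene_to_seqid(sbed_lines)
    bad = []
    for number, line in enumerate(simple_lines, 1):
        cols = line.split()
        if len(cols) < 4:
            continue
        qa, qb, sa, sb = q.get(cols[0]), q.get(cols[1]), s.get(cols[2]), s.get(cols[3])
        if None in (qa, qb, sa, sb) or qa != qb or sa != sb:
            bad.append(number)
    return bad


qbed = ["HmelChr1\t1\t9\tHmelG1\t0\t+", "HmelChr1\t20\t30\tHmelG2\t0\t+",
        "HmelChr2\t1\t9\tHmelG3\t0\t+"]
sbed = ["EisaChr1\t1\t9\tEisaG1\t0\t+", "EisaChr1\t20\t30\tEisaG2\t0\t+"]
simple = ["HmelG1\tHmelG2\tEisaG1\tEisaG2\t2\t+",   # endpoints agree
          "HmelG1\tHmelG3\tEisaG1\tEisaG2\t2\t+"]   # query spans two chromosomes
print(bad_simple_rows(simple, qbed, sbed))
```

Rows it reports are the ones that do not look like a single block on one chromosome pair.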

francicco commented 1 year ago

Updated and now it works, I also managed to introduce ###, but now I'm getting this:

JCVI utility libraries v1.2.16 [Copyright (c) 2010-2022, Haibao Tang]
(venv) [tk19812@bp1-login01 consensus_gene_set]$ python -m jcvi.assembly.allmaps path
Traceback (most recent call last):
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/user/work/tk19812/software/jcvi/jcvi/assembly/allmaps.py", line 2035, in <module>
    main()
  File "/user/work/tk19812/software/jcvi/jcvi/assembly/allmaps.py", line 862, in main
    p.dispatch(globals())
  File "/user/work/tk19812/software/jcvi/jcvi/apps/base.py", line 111, in dispatch
    globals[action](sys.argv[2:])
  File "/user/work/tk19812/software/jcvi/jcvi/assembly/allmaps.py", line 1452, in path
    opts, args, iopts = p.set_image_options(args, figsize="10x6")
  File "/user/work/tk19812/software/jcvi/jcvi/apps/base.py", line 570, in set_image_options
    group.add_option(
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/optparse.py", line 1008, in add_option
    self._check_conflict(option)
  File "/sw/lang/anaconda.3.8.8-2021.05/lib/python3.8/optparse.py", line 980, in _check_conflict
    raise OptionConflictError(
optparse.OptionConflictError: option --seed: conflicting option string(s): --seed

Some conflict in the code I think... F

tanghaibao commented 1 year ago

@francicco

My bad. While I was adding an option for another user, I unintentionally introduced the conflict 🤦. You may need to upgrade jcvi again. Sorry for the inconvenience.
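For reference, this kind of conflict is easy to reproduce with the standard library alone. The sketch below is not the jcvi code; it only shows that registering the same option string twice on an optparse parser and one of its groups raises the error seen in the traceback.

```python
from optparse import OptionConflictError, OptionGroup, OptionParser

parser = OptionParser()
parser.add_option("--seed", type="int", default=0, help="first definition")

# An OptionGroup shares its option table with the parser it belongs to,
# so reusing "--seed" here collides with the option added above.
group = OptionGroup(parser, "Image options")
try:
    group.add_option("--seed", type="int", default=1, help="second definition")
    message = None
except OptionConflictError as exc:
    message = str(exc)

print(message)  # option --seed: conflicting option string(s): --seed
```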

francicco commented 1 year ago

No problem at all!

One more thing, about jcvi.graphics.karyotype: I added synteny blocks with ###\t\t100, but I'm getting this: karyotype.pdf

Is there still something wrong with my data? Thanks a lot F


tanghaibao commented 1 year ago

This looks like it's generating synteny blocks all over the place. Looks a little strange to me. Perhaps the block boundaries were off. Make sure all the gene pairs within the same block are contained within the same cluster after ###. One block, then the next block ...
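As a sketch of that layout, assuming a block is introduced by a ### line and followed by its tab-separated gene pairs with a score (the score of 100 and the gene names are placeholders):

```python
def format_anchors(blocks, score=100):
    """Render blocks of (query_gene, subject_gene) pairs as .anchors-style text.

    Each block gets its own "###" line, followed only by the pairs
    that belong to that block.
    """
    lines = []
    for block in blocks:
        lines.append("###")
        for qgene, sgene in block:
            lines.append(f"{qgene}\t{sgene}\t{score}")
    return "\n".join(lines) + "\n"


# Two hypothetical blocks, written one after the other
blocks = [
    [("HmelG1", "EisaG1"), ("HmelG2", "EisaG2")],
    [("HmelG9", "EisaG7")],
]
print(format_anchors(blocks), end="")
```

The important part is the grouping: pairs from different blocks should never end up under the same ### line.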

I wonder if you could try going through the normal jcvi.compara.catalog ortholog process per the wiki and see how the results look? The dotplot seems quite nice to me, so I think it should work if you can gather the inputs for jcvi.compara.catalog ortholog. I'm not sure you'd want to go through the trouble of generating the .anchors file by hand.

francicco commented 1 year ago

The idea was to check how syntenic the orthologs are. If I use jcvi.compara.catalog ortholog, then the orthologs are the result of reciprocal BLAST hits, whereas my own method identifies them based on their position in the genome.

If I use the jcvi.compara.catalog ortholog process it clearly works...

[Image: Screenshot 2022-11-26 at 09 51 56]

This is on a different dataset.

I probably have to implement a better way to generate synteny blocks. How does jcvi.compara.catalog ortholog do it? Thanks a lot F

tanghaibao commented 1 year ago

No problem. All that matters to this plot is the .simple file (which is derived from the .anchors file).

Perhaps take a look at the intermediate output on this different dataset and see if the manually generated .anchors file is written properly (specifically, whether the blocks are separated properly, since the gene pairs themselves looked decent in the dotplot).
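One possible sanity check for block separation, as a sketch only: split the file into blocks at # lines and flag any block whose gene pairs touch more than one chromosome pair. The gene-to-chromosome dictionaries would come from the two BED files; the names here are hypothetical, and this is not how jcvi itself validates blocks.

```python
def parse_blocks(anchor_lines):
    """Split .anchors-style lines into blocks of (query_gene, subject_gene)."""
    blocks = []
    for line in anchor_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            blocks.append([])        # assumed block separator
            continue
        cols = line.split()
        if len(cols) < 2:
            continue
        if not blocks:
            blocks.append([])        # pairs before any separator: implicit block
        blocks[-1].append((cols[0], cols[1]))
    return [block for block in blocks if block]


def mixed_blocks(anchor_lines, q_chrom, s_chrom):
    """Return 0-based indices of blocks spanning more than one chromosome pair."""
    mixed = []
    for index, block in enumerate(parse_blocks(anchor_lines)):
        chrom_pairs = {(q_chrom.get(q), s_chrom.get(s)) for q, s in block}
        if len(chrom_pairs) > 1:
            mixed.append(index)
    return mixed


q_chrom = {"HmelG1": "HmelChr1", "HmelG2": "HmelChr1", "HmelG3": "HmelChr2"}
s_chrom = {"EisaG1": "EisaChr1", "EisaG2": "EisaChr1", "EisaG3": "EisaChr1"}
anchors = ["###", "HmelG1\tEisaG1\t100", "HmelG2\tEisaG2\t100",
           "###", "HmelG2\tEisaG2\t100", "HmelG3\tEisaG3\t100"]
print(mixed_blocks(anchors, q_chrom, s_chrom))
```

A non-empty result suggests the block boundaries are off for those blocks.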