tanghaibao / jcvi

Python library to facilitate genome assembly, annotation, and comparative genomics
BSD 2-Clause "Simplified" License

Short features from "extra features" BED missing in microsynteny plot #623

Closed: wu116 closed this issue 4 months ago

wu116 commented 5 months ago

Hi,

When I try to draw a microsynteny plot with an extra BED file, elements shorter than 187 bp are not plotted.

Is there a parameter that controls the minimum length?

# Extra element that is plotted: length = 187
ZjChrB2_Chromosome6 11457126    11457313    TE_homo_88910   0   -

# Extra element that is NOT plotted: length = 186
ZjChrB2_Chromosome6 11457126    11457312    TE_homo_88910   0   -

Thanks! W

tanghaibao commented 5 months ago

@wu116

Can you show the command that you used and the error you got? Thanks.

wu116 commented 5 months ago

Sorry for the late reply.

The command I used:

python -m jcvi.graphics.synteny blocks Osa_Zjx4.bed blocks.layout --genelabelsize=10 --genelabels=Zj06G020380.mRNA1 --extra=heli.TE.bed

The blocks file:

Zj05G020500.mRNA1   Zj06G020510.mRNA1   Zj08G020510.mRNA1   Zj07G021530.mRNA1   rna-XM_015790756.2
.   .   .   .   rna-XM_015791295.2
.   .   Zj08G020490.mRNA1   Zj07G021510.mRNA1   rna-XM_015791300.2
Zj05G020480.mRNA1   Zj06G020490.mRNA1   .   .   rna-XM_015791301.2
r*Zj05G020410.mRNA1 Zj06G020380.mRNA1   Zj08G020470.mRNA1   Zj07G021490.mRNA1   rna-XM_015790971.2
r*Zj05G020330.mRNA1 Zj06G020300.mRNA1   Zj08G020470.mRNA1   Zj07G021490.mRNA1   rna-XM_015790971.2
r*Zj05G020440.mRNA1 Zj06G020410.mRNA1   Zj08G020440.mRNA1   Zj07G021450.mRNA1   rna-XM_015790971.2
r*Zj05G020370.mRNA1 Zj06G020340.mRNA1   Zj08G020440.mRNA1   Zj07G021450.mRNA1   rna-XM_015790971.2
Zj05G020290.mRNA1   Zj06G020290.mRNA1   .   .   rna-XM_015790372.2
Zj05G020270.mRNA1   .   Zj08G020430.mRNA1   Zj07G021430.mRNA1   rna-XM_015789127.2
.   .   Zj08G020420.mRNA1   Zj07G021420.mRNA1   rna-XM_015789126.2
.   .   .   .   rna-XM_015791902.2
Zj05G020250.mRNA1   Zj06G020250.mRNA1   .   .   rna-XM_015791901.2
Zj05G020230.mRNA1   Zj06G020230.mRNA1   Zj08G020400.mRNA1   Zj07G021390.mRNA1   rna-XM_015791481.2
.   .   .   .   rna-XM_015790940.2
.   .   .   .   rna-XM_015790731.2
Zj05G020230.mRNA1   Zj06G020230.mRNA1   Zj08G020400.mRNA1   Zj07G021390.mRNA1   rna-XM_015791549.2
.   .   .   .   rna-XM_026026855.1
.   .   .   .   rna-XM_015791545.2
Zj05G020230.mRNA1   Zj06G020230.mRNA1   Zj08G020400.mRNA1   Zj07G021390.mRNA1   rna-XM_015791543.2

The blocks.layout file:

# x,   y, rotation,     ha,     va, color, ratio,            label
0.5, 0.34,        0,   left, center,      ,     1.5, ZjChrB1_5
0.5, 0.18,        0,   left, center,      ,     1.5, ZjChrB2_6
0.5, 0.50,        0,   left, center,      ,     2, ZjChrA2_8
0.5, 0.66,        0,   left, center,      ,     2, ZjChrA1_7
0.5, 0.78,        0,   left, center,      ,     1, OsChr7
# edges
e, 1, 0
e, 0, 2
e, 2, 3
e, 3, 4

I will attach the Osa_Zjx4.bed and heli.TE.bed files, together with the files above, as a zip file at the end.

And the logs:

[16:29:46] INFO     `latex` not found. latex use is disabled.                                                                         base.py:609
           INFO     `lp` not found. latex use is disabled.                                                                            base.py:611
[16:29:47] INFO     Set text.usetex=False. Font styles may be inconsistent.                                                           base.py:442
           DEBUG    Load file `Osa_Zjx4.bed`                                                                                           base.py:34
[16:29:50] DEBUG    Load file `blocks`                                                                                                base.py:34
           DEBUG    Load file `blocks.layout`                                                                                         base.py:34
           DEBUG    Load file `heli.TE.bed`                                                                                            base.py:34
Column 0: Zj05G020230.mRNA1 - Zj05G020500.mRNA1 (ZjChrB1_Chromosome5:11529880-11665394)
  ZjChrB1_Chromosome5 .. 28 (12) features .. +
Extracted 3 features (2 after pruning)
Column 1: Zj06G020230.mRNA1 - Zj06G020510.mRNA1 (ZjChrB2_Chromosome6:11327091-11513610)
  ZjChrB2_Chromosome6 .. 29 (11) features .. +
Extracted 5 features (2 after pruning)
Column 2: Zj08G020400.mRNA1 - Zj08G020510.mRNA1 (ZjChrA2_Chromosome8:11958870-12041619)
  ZjChrA2_Chromosome8 .. 12 (11) features .. +
Extracted 3 features (3 after pruning)
Column 3: Zj07G021390.mRNA1 - Zj07G021530.mRNA1 (ZjChrA1_Chromosome7:12198030-12271115)
  ZjChrA1_Chromosome7 .. 15 (11) features .. +
Extracted 1 features (1 after pruning)
Column 4: rna-XM_015790756.2 - rna-XM_015791543.2 (Osa_NC_029262.1:2801643-3069686)
  Osa_NC_029262.1 .. 17 (20) features .. -
Extracted 0 features (0 after pruning)
           DEBUG    Matplotlib backend is: agg                                                                                        base.py:318
           DEBUG    Attempting save as: blocks2.pdf                                                                                   base.py:319
           WARNING  findfont: Generic family 'sans-serif' not found because none of the following families were found:       font_manager.py:1333
                    Helvetica    #(repeated 58 lines)
           DEBUG    Figure saved to `blocks.pdf` (2400px x 2100px)                                                                   base.py:33

For example, according to the two BED files, there should be a 124 bp TE (TE_homo_88910) close to the end of the gene Zj06G020380.mRNA1. This TE was not plotted by the command above, whereas other TEs of 187 bp or longer were plotted successfully, and no fatal error was raised.

I manually increased the length of all short TEs to 187 bp as a temporary solution and it worked. Maybe you can help me figure out where the real problem lies.
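For anyone who needs the same temporary workaround, the padding step could be scripted roughly as follows. This is only a sketch and not part of jcvi: the helper names `pad_bed_line` and `pad_bed` and the output file name are hypothetical, the 187 bp target is just the cutoff observed in this thread, and the code assumes a tab-delimited BED file with a 0-based start and an exclusive end. It extends only the end coordinate, so a padded feature near a chromosome end could run past the sequence.

```python
MIN_LEN = 187  # assumed target length, taken from the cutoff observed in this thread


def pad_bed_line(line, min_len=MIN_LEN):
    """Return a BED line whose interval is at least min_len bp long.

    Comment lines and lines with fewer than three fields are returned unchanged.
    """
    stripped = line.rstrip("\n")
    fields = stripped.split("\t")
    if stripped.startswith("#") or len(fields) < 3:
        return stripped
    start, end = int(fields[1]), int(fields[2])
    # BED intervals are 0-based and half-open, so the length is end - start.
    if end - start < min_len:
        fields[2] = str(start + min_len)
    return "\t".join(fields)


def pad_bed(src, dst, min_len=MIN_LEN):
    """Copy a BED file from src to dst, padding short features on the way."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(pad_bed_line(line, min_len) + "\n")
```

Calling `pad_bed("heli.TE.bed", "heli.TE.padded.bed")` would write a padded copy that can be passed to `--extra`. The padded coordinates no longer reflect the true TE boundaries, so this is only useful for display.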

Thanks!! W

Attached file: files.zip

tanghaibao commented 4 months ago

@wu116

Thanks for sending the input files. By default, features smaller than a certain size will not be plotted. In code:

https://github.com/tanghaibao/jcvi/blob/34535f99d8dd7b24d8f6ae4a70a7d6bcf4bf96c1/jcvi/graphics/synteny.py#L469-L473

I may be able to create an option for you to disable this behavior though.
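To illustrate the idea of size-based pruning with an opt-out switch, here is a minimal sketch. It is not the jcvi implementation: the `Feature` class, the `select_features` function, and the `prune` and `min_fraction` parameters are all hypothetical. The 1/1000 fraction is an assumption inferred from the numbers in this thread: the Column 1 region (ZjChrB2_Chromosome6:11327091-11513610) spans about 186.5 kb, which would put the cutoff between 186 bp and 187 bp, matching what was observed.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    """Hypothetical stand-in for a BED record (0-based start, exclusive end)."""

    seqid: str
    start: int
    end: int
    name: str

    @property
    def span(self) -> int:
        return self.end - self.start


def select_features(
    features: List[Feature],
    region_start: int,
    region_end: int,
    min_fraction: float = 0.001,  # assumed value, inferred from this thread
    prune: bool = True,
) -> List[Feature]:
    """Drop features shorter than min_fraction of the region span, unless prune is False."""
    if not prune:
        return list(features)
    region_span = region_end - region_start + 1
    min_span = region_span * min_fraction
    return [f for f in features if f.span >= min_span]


region = (11327091, 11513610)  # Column 1 region from the log above
te_187 = Feature("ZjChrB2_Chromosome6", 11457126, 11457313, "TE_homo_88910")
te_186 = Feature("ZjChrB2_Chromosome6", 11457126, 11457312, "TE_homo_88910")

kept = select_features([te_187, te_186], *region)
everything = select_features([te_187, te_186], *region, prune=False)
```

Under these assumptions `kept` contains only the 187 bp feature, while `everything` contains both. Because the threshold scales with the plotted region, the same feature could be drawn in a narrow region and dropped in a wide one.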

Adamtaranto commented 4 months ago

@tanghaibao I have added a --noprune feature in my dev branch. Should be able to copy code from there.

https://github.com/Adamtaranto/jcvi/blob/ad0c3d286e6838799bd0c9c0ebef1cdf73613439/jcvi/graphics/ribbon.py#L534

tanghaibao commented 4 months ago

@Adamtaranto

Fantastic. Would you mind a small PR? I'll pull this right in.

Adamtaranto commented 4 months ago

Sure, I'll do it on a fresh branch so it's just that edit.