tanghaibao / jcvi

Python library to facilitate genome assembly, annotation, and comparative genomics
BSD 2-Clause "Simplified" License

Using this workflow for a new genome #402

Closed ardy20 closed 2 years ago

ardy20 commented 3 years ago

Dear Sir, how can I use this workflow for a newly assembled species that is not listed in Phytozome?

For example, for the Jojoba genome that we assembled and annotated, what should be placed in the following command?

python -m jcvi.compara.catalog ortholog grape peach --no_strip_names

Regards

tanghaibao commented 3 years ago

@ardy20

You need to follow the wiki once to understand the inputs to the pipeline: https://github.com/tanghaibao/jcvi/wiki/MCscan-(Python-version)

For any species (whether from Phytozome or your own), you want to start from two files: the sequences in FASTA format and the gene annotation in GFF3 format.

Therefore for your Jojoba genome, you'll need to start from those two files as well. In the wiki tutorial, take a look at what is downloaded from Phytozome (FASTA and GFF3), for example, for grape and peach, to kickstart the pipeline.
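To make the two inputs more concrete: the wiki turns the GFF3 annotation into a BED file with jcvi's own converter before the pipeline is run. The sketch below is not that converter. It is only a minimal, stand-alone illustration of the idea, assuming mRNA features whose ID attribute carries the same names as the FASTA headers. The function name gff3_to_bed and the sample record are hypothetical.

```python
def gff3_to_bed(gff_lines, feature_type="mRNA", key="ID"):
    """Return six-column BED lines (chrom, start, end, name, score, strand).

    Hypothetical helper for illustration; it keeps only features of one type
    and names each interval after the chosen GFF3 attribute.
    """
    bed = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF3 comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        if key not in attrs:
            continue
        start = int(cols[3]) - 1  # GFF3 is 1-based; BED is 0-based, half-open
        bed.append("\t".join([cols[0], str(start), cols[4], attrs[key], "0", cols[6]]))
    return bed


# Made-up record, not taken from any real annotation.
example = ["chr1\tsrc\tmRNA\t101\t500\t.\t+\t.\tID=gene1.m01;Parent=gene1"]
print(gff3_to_bed(example)[0])
```

Whatever tool produces the BED file, the point to check is that the name it writes for each gene is the same string that appears in the FASTA headers.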

ardy20 commented 3 years ago

Thanks, Tang, for your quick response.

I installed and tested the tool with both conda and pip on a high-performance computing cluster.

I followed the wiki and prepared the CDS and GFF3 files as you suggested.

But when I run the following command, I get an error and the output files are empty. Is it related to the Python version? How can I fix this?

python -m jcvi.compara.catalog ortholog jojoba jojoba --no_strip_names

[15:29:35] ERROR Set text.usetex=False. Font styles may be inconsistent. base.py:403
DEBUG lastdb jojoba jojoba.cds base.py:1184
/bin/bash: lastdb: command not found
DEBUG lastal -u 0 -P 24 -i3G -f BlastTab jojoba jojoba.cds >./jojoba.jojoba.last base.py:1184
/bin/bash: lastal: command not found
DEBUG Assuming --qbed=jojoba.bed --sbed=jojoba.bed synteny.py:499
DEBUG Looks like self-self comparison. synteny.py:512
DEBUG Load file jojoba.bed base.py:34
[15:29:36] DEBUG Load file jojoba.bed base.py:34
DEBUG Load BLAST file jojoba.jojoba.last.P98L0.inverse (total 0 lines) blastfilter.py:47
DEBUG Load file jojoba.jojoba.last.P98L0.inverse base.py:34
DEBUG running the cscore filter (cscore>=0.70) .. blastfilter.py:108
DEBUG after filter (0->0) .. blastfilter.py:111
DEBUG running the local dups filter (tandem_Nmax=10) .. blastfilter.py:116
DEBUG after filter (0->0) .. blastfilter.py:156
DEBUG Assuming --qbed=jojoba.bed --sbed=jojoba.bed synteny.py:499
DEBUG Looks like self-self comparison. synteny.py:512
DEBUG Load file jojoba.bed base.py:34
DEBUG Load file jojoba.bed base.py:34
[15:29:37] DEBUG Load file jojoba.jojoba.last.P98L0.inverse.filtered base.py:34
DEBUG A total of 0 BLAST imported from jojoba.jojoba.last.P98L0.inverse.filtered. synteny.py:387
DEBUG Chaining distance = 20 synteny.py:1931
DEBUG Load file jojoba.jojoba.anchors base.py:34
DEBUG A total of 0 anchor was found. Aborted. synteny.py:1480
Traceback (most recent call last):
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/catalog.py", line 947, in <module>
    main()
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/catalog.py", line 78, in main
    p.dispatch(globals())
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/apps/base.py", line 110, in dispatch
    globalsaction
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/catalog.py", line 715, in ortholog
    scan(dargs)
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/synteny.py", line 1948, in scan
    summary([anchor_file])
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/synteny.py", line 1481, in summary
    raise ValueError("A total of 0 anchor was found. Aborted.")
ValueError: A total of 0 anchor was found. Aborted.

tanghaibao commented 3 years ago

@ardy20

/bin/bash: lastdb: command not found

LAST is missing. Try building it from https://gitlab.com/mcfrith/last and make sure that lastdb and lastal are on your PATH.
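A quick way to confirm this before rerunning the pipeline is to ask Python, from the same environment that runs jcvi, whether it can see the two executables. This is a small sketch using only the standard library; the helper name missing_tools is made up for illustration.

```python
import shutil


def missing_tools(tools=("lastdb", "lastal")):
    """Return the names of executables that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not on PATH:", ", ".join(missing))
else:
    print("lastdb and lastal were found")
```

If either name is reported as missing, the shell that launches the pipeline will hit the same "command not found" message shown in the log.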

ardy20 commented 3 years ago

Hi, I installed LAST in my conda environment and added lastal and lastdb (in bin) to the $PATH, but I still see the error and jojoba.jojoba.anchors is empty.

Please help me fix this.

WARNING 40888-Ss.00g123690.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING 40888-Ss.00g251500.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING 40888-Ss.00g162210.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING 40888-Ss.00g162220.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING 40888-Ss.00g389630.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING 40888-Ss.00g389690.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING 40888-Ss.00g011100.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING 40888-Ss.00g082590.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING 40888-Ss.00g214460.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING 40888-Ss.00g214600.m01.CDS not in jojoba.bed blastfilter.py:65
WARNING too many warnings.. suppressed blastfilter.py:67
[18:58:37] DEBUG running the cscore filter (cscore>=0.70) .. blastfilter.py:108
DEBUG after filter (0->0) .. blastfilter.py:111
DEBUG running the local dups filter (tandem_Nmax=10) .. blastfilter.py:116
DEBUG after filter (0->0) .. blastfilter.py:156
DEBUG Assuming --qbed=jojoba.bed --sbed=jojoba.bed synteny.py:499
DEBUG Looks like self-self comparison. synteny.py:512
DEBUG Load file jojoba.bed base.py:34
DEBUG Load file jojoba.bed base.py:34
[18:58:38] DEBUG Load file jojoba.jojoba.last.P98L0.inverse.filtered base.py:34
DEBUG A total of 0 BLAST imported from jojoba.jojoba.last.P98L0.inverse.filtered. synteny.py:387
DEBUG Chaining distance = 20 synteny.py:1931
DEBUG Load file jojoba.jojoba.anchors base.py:34
DEBUG A total of 0 anchor was found. Aborted. synteny.py:1480
Traceback (most recent call last):
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/catalog.py", line 947, in <module>
    main()
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/catalog.py", line 78, in main
    p.dispatch(globals())
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/apps/base.py", line 110, in dispatch
    globalsaction
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/catalog.py", line 715, in ortholog
    scan(dargs)
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/synteny.py", line 1948, in scan
    summary([anchor_file])
  File "/scratch/project/qaafi-cnafs/jcvi/lib/python3.6/site-packages/jcvi/compara/synteny.py", line 1481, in summary
    raise ValueError("A total of 0 anchor was found. Aborted.")
ValueError: A total of 0 anchor was found. Aborted.

tanghaibao commented 3 years ago

@ardy20

See the warnings on top. The IDs do not match.

Make sure your FASTA sequence names match the names in the 3rd column of the BED file.
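One way to see the mismatch directly is to compare the two sets of names. The sketch below assumes a standard BED layout, in which the name field is the fourth field when counting from one (index 3 when counting from zero), and FASTA headers whose ID is the text up to the first whitespace. The function names are hypothetical, and the sample BED name is invented; only the ".CDS" suffix on the FASTA side is taken from the warnings above.

```python
def fasta_ids(lines):
    """IDs from FASTA headers: the text after '>' up to the first whitespace."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def bed_names(lines):
    """Names from the name field (index 3) of whitespace-delimited BED lines."""
    names = set()
    for line in lines:
        cols = line.split()
        if len(cols) >= 4 and not line.startswith("#"):
            names.add(cols[3])
    return names


def compare_ids(fasta_lines, bed_lines):
    """Return (IDs only in the FASTA, names only in the BED), both sorted."""
    in_fasta, in_bed = fasta_ids(fasta_lines), bed_names(bed_lines)
    return sorted(in_fasta - in_bed), sorted(in_bed - in_fasta)


# Illustrative records: the FASTA ID carries a '.CDS' suffix, the BED name does not.
fasta = [">40888-Ss.00g123690.m01.CDS some description", "ATGGCC"]
bed = ["scaffold1\t100\t500\t40888-Ss.00g123690.m01\t0\t+"]
only_fasta, only_bed = compare_ids(fasta, bed)
print("only in FASTA:", only_fasta)
print("only in BED:", only_bed)
```

With real files, pass open("jojoba.cds") and open("jojoba.bed") in place of the sample lists. If both returned lists are long and differ by a constant prefix or suffix, renaming one side consistently is usually enough.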

ardy20 commented 3 years ago

Dear Haibao

Great! It worked! Thanks! Could you please guide me on how to interpret the pairwise synteny visualization graphs? I do not understand where the three horizontal and vertical blocks are in the example that you provided in your documentation. I am new to this, and any guidance, including a useful paper or tutorial, would be appreciated.

Also, how can I identify whole-genome duplication, duplicated and tandem repeats, etc.?

Regards Ardy

ardy20 commented 3 years ago

Hello again

The PDF file could not be opened after running the command below:

(base) uqakhara@flashlite1:.../project/qaafi-cnafs/jcvi> python -m jcvi.graphics.dotplot jojoba.jojoba.anchors
[11:36:29] ERROR Fall back to Python implementation of BlastLine blast.py:28
[11:36:32] ERROR Set text.usetex=False. Font styles may be inconsistent. base.py:403
DEBUG Assuming --qbed=jojoba.bed --sbed=jojoba.bed synteny.py:499
DEBUG Looks like self-self comparison. synteny.py:512
DEBUG Load file jojoba.bed base.py:34
DEBUG Load file jojoba.bed base.py:34
[11:36:33] DEBUG xsize=41050 ysize=41050 dotplot.py:338
WARNING findfont: Font family ['sans-serif'] not found. Falling back to DejaVu Sans. font_manager.py:1269
[11:36:33 AM] DEBUG Dot plot title: Intra-genomic comparison within jojoba (1,187 gene pairs) dotplot.py:381
ERROR savefig failed. Reset usetex to False. base.py:287
missing font metrics file: phvr7t
Traceback (most recent call last):
  File "/scratch/project_mnt/S0030/jcvi/jcvi/graphics/base.py", line 283, in savefig
    plt.savefig(figname, dpi=dpi, format=format)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/pyplot.py", line 723, in savefig
    res = fig.savefig(*args, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/figure.py", line 2203, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/backend_bases.py", line 2126, in print_figure
    **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/backends/backend_pdf.py", line 2548, in print_pdf
    self.figure.draw(renderer)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/figure.py", line 1736, in draw
    renderer, self, artists, self.suppressComposite)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/image.py", line 137, in _draw_list_compositing_images
    a.draw(renderer)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 2630, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/image.py", line 137, in _draw_list_compositing_images
    a.draw(renderer)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/text.py", line 685, in draw
    bbox, info, descent = textobj._get_layout(renderer)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/text.py", line 293, in _get_layout
    ismath="TeX" if self.get_usetex() else False)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/backends/_backend_pdf_ps.py", line 47, in get_text_width_height_descent
    s, fontsize, renderer=self)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/texmanager.py", line 460, in get_text_width_height_descent
    page, = dvi
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 243, in __iter__
    while self._read():
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 319, in _read
    self._dtable[byte](self, byte)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 161, in wrapper
    return method(self, *[f(self, byte-min) for f in get_args])
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 462, in _fnt_def
    self._fnt_def_real(k, c, s, d, a, l)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 469, in _fnt_def_real
    raise FileNotFoundError("missing font metrics file: %s" % fontname)
FileNotFoundError: missing font metrics file: phvr7t

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/scratch/project_mnt/S0030/jcvi/jcvi/graphics/dotplot.py", line 537, in <module>
    dotplot_main(sys.argv[1:])
  File "/scratch/project_mnt/S0030/jcvi/jcvi/graphics/dotplot.py", line 532, in dotplot_main
    savefig(image_name, dpi=iopts.dpi, iopts=iopts)
  File "/scratch/project_mnt/S0030/jcvi/jcvi/graphics/base.py", line 289, in savefig
    plt.savefig(figname, dpi=dpi)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/pyplot.py", line 723, in savefig
    res = fig.savefig(*args, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/figure.py", line 2203, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/backend_bases.py", line 2126, in print_figure
    **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/backends/backend_pdf.py", line 2548, in print_pdf
    self.figure.draw(renderer)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/figure.py", line 1736, in draw
    renderer, self, artists, self.suppressComposite)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/image.py", line 137, in _draw_list_compositing_images
    a.draw(renderer)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 2630, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/image.py", line 137, in _draw_list_compositing_images
    a.draw(renderer)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/text.py", line 729, in draw
    mtext=mtext)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/backends/backend_pdf.py", line 2012, in draw_tex
    page, = dvi
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 243, in __iter__
    while self._read():
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 319, in _read
    self._dtable[byte](self, byte)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 161, in wrapper
    return method(self, *[f(self, byte-min) for f in get_args])
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 462, in _fnt_def
    self._fnt_def_real(k, c, s, d, a, l)
  File "/sw/RCC/Anaconda/2020.02/lib/python3.7/site-packages/matplotlib/dviread.py", line 469, in _fnt_def_real
    raise FileNotFoundError("missing font metrics file: %s" % fontname)
FileNotFoundError: missing font metrics file: phvr7t

tanghaibao commented 3 years ago

@ardy20

You are missing the LaTeX font package. See:

https://github.com/tanghaibao/jcvi/issues/368

and the FAQ in the wiki:

https://github.com/tanghaibao/jcvi/wiki

If you can't install the font package, add the option --notex to your command, but the fonts will look slightly uglier.
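To decide between the two routes, it can help to check whether the TeX installation on the machine can actually locate the metrics file named in the traceback (phvr7t). The sketch below assumes a TeX distribution that ships the kpsewhich lookup tool; the function name tex_font_metric_available is hypothetical and is not part of jcvi or matplotlib.

```python
import shutil
import subprocess


def tex_font_metric_available(tfm="phvr7t.tfm"):
    """Check whether kpsewhich can locate a TeX font metrics file.

    Returns True or False when kpsewhich is installed, and None when it is
    not on PATH (in which case TeX itself is probably missing as well).
    """
    kpsewhich = shutil.which("kpsewhich")
    if kpsewhich is None:
        return None
    result = subprocess.run([kpsewhich, tfm], capture_output=True, text=True)
    # kpsewhich prints the resolved path when the lookup succeeds.
    return result.returncode == 0 and bool(result.stdout.strip())


status = tex_font_metric_available()
if status:
    print("phvr7t.tfm was found; TeX rendering should be able to load it")
else:
    print("phvr7t.tfm not found; install the font package or use --notex")
```

A result of False or None points to the same conclusion as the traceback: either install the missing font package or pass --notex.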

Haibao

adejokejoseph commented 1 year ago

@ardy20

See the warnings on top. The IDs do not match.

Make sure your FASTA sequence names match the names in the 3rd column of the BED file.

I have a similar issue. When trying to search for pairwise synteny, I get the attached error. The IDs in the .gff, .cds and .bed files are the same.

[Attached image: 2023-03-10]

How do I resolve this issue, please?

Thank you.