larssnip / micropan

R package for microbial pangenomics

I am trying to get the distances by bDist() #8

Closed byeollee closed 3 years ago

byeollee commented 3 years ago

Hello, good morning.

I ran this on 82 genomes. blastpAllAll finished, but bDist then stops with an error.

This is the code.

blastpAllAll(file.path("faa", faa.files), out.folder = "blast2", job = 100, verbose = T)
dst.tbl <- bDist(blast.files = file.path("blast2", list.files("blast2", pattern = "txt$")))
bDist:
...reading 82 self alignments...
...found BLAST results for 5154971 unique sequences...
...reading remaining alignments...
Error in bDist(blast.files = file.path("blast2", list.files("blast2", :
Self-alignment lacking for Query107042in blast.file/home/star/blast2/GID19_vs_GID10.txt

This is the error I get. I tried generating GID19_vs_GID10.txt again, but the same error comes back.

I have 3407 BLAST files.

What does "Self-alignment lacking" mean?

Please help. Thank you :)

larssnip commented 3 years ago

It looks to me like a sequence (number 107042) has no self-alignment. You read all 82 self-alignment files, but I guess that in the self-alignment for GID19 or GID10 (whichever genome it belongs to) this sequence is never listed. I have seen this before, and I must say it is a strange behavior from BLAST.

If we search a set of sequences against themselves with BLAST, we would expect every sequence to have a hit against itself, since all sequences are in fact identical to themselves! In at least some cases this occurred because something happened to the BLAST output and the file was corrupted. Re-running the self-alignment solved it then.
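One way to see which sequences lack a self-hit is to compare the set of all sequence tags against those that appear as both database and query in the same row. The sketch below uses a toy table in base R; the column names `Dbase` and `Query` and the helper `find_missing_self` are assumptions made for illustration, not necessarily what micropan uses internally.

```r
# Toy alignment table. The column names Dbase and Query are assumptions.
blast.tbl <- data.frame(
  Dbase = c("GID1_seq1", "GID1_seq2", "GID2_seq1", "GID1_seq1"),
  Query = c("GID1_seq1", "GID1_seq1", "GID2_seq1", "GID2_seq2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return every sequence that never aligns to itself
find_missing_self <- function(tbl) {
  all.seqs  <- unique(c(tbl$Dbase, tbl$Query))             # every sequence seen
  self.seqs <- unique(tbl$Query[tbl$Dbase == tbl$Query])   # those with a self-hit
  setdiff(all.seqs, self.seqs)
}

missing.self <- find_missing_self(blast.tbl)
print(missing.self)
```

In this toy table, `GID1_seq2` and `GID2_seq2` appear in alignments but never against themselves, so they are the ones reported.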

To debug this, can you first delete the self-alignment files for GID10 (GID10_vs_GID10.txt) and GID19, and then try to re-run blastpAllAll?
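A minimal sketch of that clean-up step, run here on a temporary folder with dummy files rather than real BLAST output. The `<ID>_vs_<ID>.txt` pattern follows the file names quoted in this thread; the helper `self_alignment_files` is hypothetical, and the sketch assumes that blastpAllAll only recomputes result files that are missing from `out.folder`, which is what the advice above implies.

```r
# Hypothetical helper: build self-alignment file paths for the given genome IDs
self_alignment_files <- function(genome.ids, folder) {
  file.path(folder, paste0(genome.ids, "_vs_", genome.ids, ".txt"))
}

# Dummy folder standing in for "blast2"
demo.folder <- tempfile("blast2_demo")
dir.create(demo.folder)
demo.files <- c("GID10_vs_GID10.txt", "GID19_vs_GID19.txt", "GID19_vs_GID10.txt")
invisible(file.create(file.path(demo.folder, demo.files)))

# Delete only the two self-alignment files, leaving the pair file in place
targets   <- self_alignment_files(c("GID10", "GID19"), demo.folder)
removed   <- file.remove(targets[file.exists(targets)])
remaining <- list.files(demo.folder)
print(remaining)

# With the real folder, the next step would be to repeat the original call, e.g.
# blastpAllAll(file.path("faa", faa.files), out.folder = "blast2", job = 100, verbose = TRUE)
```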

LS

byeollee commented 3 years ago

Oh, thank you for your advice.

I reinstalled the package (install.packages("micropan"))

and then got this result:

dst.tbl <- bDist(list.files("blast2", pattern = "txt$", full.names = T))
readBlastSelf:
...received 3403 blast-files...
...found 82 self-alignment files...
...returns 10062712 alignment results
readBlastPairs:
...received 3403 blast-files...
...found 3321 alignment files who are NOT self-alignments...
...returns 390549297 alignment results
bDist:
...found 369408955 alignments...
...where 490693 are self-alignments...
**Error in bDist(list.files("blast2", pattern = "txt$", full.names = T)) : No self-alignment for sequences: GID68_seq5365,GID81_seq2266,GID57_seq2402,GID3_seq2241,GID82_seq2435,GID19_seq3803**

So I tried readBlastSelf and readBlastPair separately.

`> self.tbl <- readBlastSelf(file.path("blast2", list.files("blast2", pattern = "txt$")))
readBlastSelf:
...received 3403 blast-files...
...found 82 self-alignment files...
...returns 10062712 alignment results

pair.tbl <- readBlastPair(file.path("blast2", list.files("blast2", pattern = "txt$")))
readBlastPairs:
...received 3403 blast-files...
...found 3321 alignment files who are NOT self-alignments...
...returns 390549297 alignment results
dst.tbl <- bDist(blast.tbl = bind_rows(self.tbl, pair.tbl))
bDist:
...found 369408955 alignments...
...where 490693 are self-alignments...
Error in bDist(blast.tbl = bind_rows(self.tbl, pair.tbl)) : No self-alignment for sequences: GID68_seq5365,GID81_seq2266,GID57_seq2402,GID3_seq2241,GID82_seq2435,GID19_seq3803`

The same error comes back. What should I do?

thank you :)
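The thread does not record how this second error was resolved. As a hedged starting point, the sequence tags in the error message can be mapped back to the self-alignment files they would come from, following the earlier advice to regenerate those files. The sketch assumes that the genome ID is everything before `_seq` in a tag and that self-alignment files are named `<ID>_vs_<ID>.txt`, as in the file names quoted above. It also checks that the file counts in the log match one self-alignment per genome plus one file per genome pair.

```r
# Sequence tags copied from the error message in this thread
bad.seqs <- c("GID68_seq5365", "GID81_seq2266", "GID57_seq2402",
              "GID3_seq2241", "GID82_seq2435", "GID19_seq3803")

# Assumption: the genome ID is everything before "_seq"
genome.ids <- unique(sub("_seq.*$", "", bad.seqs))

# Self-alignment files that may be worth inspecting or regenerating
suspect.files <- paste0(genome.ids, "_vs_", genome.ids, ".txt")
print(suspect.files)

# File-count check: 82 self-alignments plus one file per pair of genomes
n.genomes      <- 82
expected.files <- n.genomes + choose(n.genomes, 2)
print(expected.files)
```

The count comes to 3403 (82 + 3321), which agrees with the numbers reported by readBlastSelf and readBlastPair in the log.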

byeollee commented 3 years ago

Thank you for your attention. I finally solved this problem.

Thank you so much and have a nice day.