CCB-SB / plsdb

PLSDB pipeline to collect bacterial plasmids from NCBI
https://ccb-microbe.cs.uni-saarland.de/plsdb/

E coli plasmid sequences #4

Closed LeonardosMageiros closed 3 years ago

LeonardosMageiros commented 4 years ago

Hi,

I have a list of around 16,000 E. coli genes, and I would like to identify which of them are located on plasmids. I tried to use the web interface, but the maximum number of entries in a FASTA file is 10. Is there a way to run my query without that restriction?

Alternatively, is it possible to batch download the sequences of all E. coli plasmids in your database?

Thank you very much in advance for your time and help.

Best, Leonardos

VGalata commented 4 years ago

Dear @LeonardosMageiros,

There is no option to increase the number of queries for the web server.

Therefore, I would recommend that you do your analysis locally by downloading the PLSDB data. Unfortunately, there is no option to filter the plasmids before the download. But, since plasmids can be exchanged between bacterial cells, it might be better not to limit your analysis to E. coli plasmids only. I would run the BLAST search using all plasmids, filter the hits by E-value, sequence identity and query coverage, and then examine the plasmids covered by the remaining hits.
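The filtering step could be sketched roughly as follows. This is only an illustration, not part of the PLSDB pipeline: it assumes the local BLAST search was run with tabular output in the column order `qseqid sseqid pident length evalue bitscore qcovs`, and the function names and threshold values are hypothetical placeholders that should be adapted to the analysis.

```python
import csv
from io import StringIO

# Assumed column order of the tabular BLAST output (an assumption, e.g. from
# -outfmt "6 qseqid sseqid pident length evalue bitscore qcovs").
COLUMNS = ["qseqid", "sseqid", "pident", "length", "evalue", "bitscore", "qcovs"]


def filter_hits(handle, max_evalue=1e-10, min_identity=90.0, min_coverage=80.0):
    """Yield hits that pass the E-value, identity and query coverage cutoffs.

    The default thresholds are illustrative only, not recommended values.
    """
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        if (
            float(row["evalue"]) <= max_evalue
            and float(row["pident"]) >= min_identity
            and float(row["qcovs"]) >= min_coverage
        ):
            yield row


def plasmids_per_gene(hits):
    """Map each query gene to the set of plasmids it hits."""
    result = {}
    for hit in hits:
        result.setdefault(hit["qseqid"], set()).add(hit["sseqid"])
    return result


# Made-up example lines standing in for a real BLAST result file.
example = StringIO(
    "geneA\tplasmid1\t99.5\t900\t1e-150\t1500\t100\n"
    "geneA\tplasmid2\t75.0\t300\t1e-05\t200\t30\n"
    "geneB\tplasmid1\t95.0\t600\t1e-80\t900\t92\n"
)
genes_on_plasmids = plasmids_per_gene(filter_hits(example))
print(genes_on_plasmids)
```

For a real run, the `StringIO` object would be replaced by an open file handle on the BLAST result file. The plasmids in the resulting mapping can then be looked up in the PLSDB metadata to check their host species.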

I hope that helps. Let me know if you have any further questions!

Best, Valentina