bioinfo-ut / PlasmidSeeker

A k-mer based program for the identification of known plasmids from whole-genome sequencing reads
BSD 3-Clause "New" or "Revised" License

Plasmid present in database? #3

SK-N-BE closed 6 years ago

SK-N-BE commented 6 years ago

Hello, I am looking for the copy number of the plasmid pHNSHP45 in E. coli isolates. The plasmid should be present, but PlasmidSeeker did not identify it, so I was wondering whether this plasmid is included in your database. How can I check whether a plasmid is present in the database?

Greetings SK-N-BE

SK-N-BE commented 6 years ago

I checked for the presence of the plasmid using BRIG. In fact, the plasmid is not present in isolate 2. However, it is present in isolate 1.

[Figure: phnshp45_brig (BRIG comparison image)]

bioinfo-ut commented 6 years ago

You can check the names of the plasmids included in the database by going to the database directory and opening the text file "names.txt".
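For a large database, searching the file can be quicker than reading it. The following is a minimal sketch, assuming "names.txt" holds one plasmid name per line (an assumption, not confirmed here); the function name `find_plasmid_names` is hypothetical and not part of PlasmidSeeker.

```python
from pathlib import Path


def find_plasmid_names(names_path, query):
    """Return all non-empty lines of the names file that contain `query`.

    Assumption: the file lists one plasmid name per line.
    The match is a case-insensitive substring search.
    """
    needle = query.lower()
    matches = []
    # errors="replace" keeps the search going if the file has odd bytes.
    with Path(names_path).open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            name = line.strip()
            if name and needle in name.lower():
                matches.append(name)
    return matches
```

Calling `find_plasmid_names("names.txt", "pHNSHP45")` from inside the database directory returns the matching names; an empty list means no name in the file contains the query.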

SK-N-BE commented 6 years ago

Hi, thank you! The plasmid is in fact not included in the database. How can I add the plasmid to the database?

SK-N-BE commented 6 years ago

Hello, I need your help.

I have now built my own database. I selected two plasmids that I had found with PlasmidSeeker using your database (as a kind of control). In addition, I added the sequence of the plasmid pHNSHP45 to the database.

I got results: with the default values (k-mer length 20 and so on), plasmidseeker.pl identified the two plasmids that I had found before. However, it did not identify the pHNSHP45 plasmid, even though I know that the plasmid is present in one of my samples (see BRIG figure).

Therefore, I made a new database, but this time I reduced the k-mer length to 10. Now the plasmid was found with 91.27% identity.

However, I got these error messages and do not know what exactly they mean:

Warnmeldung:
In testiPlasmiidi2(bac[, 1], bac[, 2], plasmid[, 1], plasmid[, 2], :
  Vale bakter v6i koondumisprobleem????
tmp_list_cov_10_intrsec.list
tmp_list_cov_10_intrsec.list
Total size 178258
Finding intersection
Size 89129
Sorting
Done
Printing distribution of tmp_plasmid.txt
Initial median 426
tmp_final_10_0_diff1.list
tmp_final_10_0_diff1.list
Total size 163646
Finding intersection
Size 81823
Sorting
Done
FINAL median 392
Testing plasmid...
Warnmeldung:
In testiPlasmiidi2(bac[, 1], bac[, 2], plasmid[, 1], plasmid[, 2], :
  Vale bakter v6i koondumisprobleem????
Argument ""Tehniline" isn't numeric in sprintf at plasmidseeker.pl line 413.
Argument ""Tehniline" isn't numeric in sprintf at plasmidseeker.pl line 413.
Argument ""Tehniline" isn't numeric in sprintf at plasmidseeker.pl line 413.

SK-N-BE commented 6 years ago

Hi again, I made a mistake. I was using the isolate for which BRIG indicated that the plasmid is not present. In the other isolate, the plasmid named above is present with 77% identity (when using the default k-mer length), which might be correct. As you can see in the BRIG figure, some small regions are not present.

Thus, using a k-mer length of only 10 gives a false positive result.
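A rough calculation suggests why such a short k-mer is prone to chance matches: there are only 4^10 = 1,048,576 distinct 10-mers, far fewer than the number of positions in a bacterial genome. The sketch below is only an illustration under stated assumptions: it treats the genome as a uniform random sequence (real genomes are not), uses a round genome length of 5,000,000 bases as a stand-in for E. coli, and does not model how PlasmidSeeker itself scores plasmids. The function name `chance_kmer_fraction` is hypothetical.

```python
import math


def chance_kmer_fraction(k, genome_length):
    """Expected fraction of all possible k-mers that occur at least once
    in a uniform random sequence of the given length.

    Assumption: bases are independent and equally likely, so a specific
    k-mer appears at a given position with probability 4**-k.
    """
    positions = genome_length - k + 1
    if positions <= 0:
        return 0.0
    p = 4.0 ** -k
    # 1 - (1 - p)**positions, computed stably for very small p.
    return -math.expm1(positions * math.log1p(-p))


# Compare the k-mer length tried above (10) with the default (20).
for k in (10, 20):
    print(k, chance_kmer_fraction(k, 5_000_000))
```

Under this simplified model, nearly every possible 10-mer is expected to occur somewhere in the genome by chance, whereas a given 20-mer is expected to occur only a few times in a million, which is consistent with the k = 10 hit in the isolate that lacks the plasmid being spurious.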

Still, is it possible to add the plasmid to your main database, or do I have to build a completely new database?