Closed vascokarla closed 1 month ago
Hi @vascokarla
Thank you for the information you provided. Using el_gato 1.19, I generated a call for two (SRR10080716 and SRR10177472) of the three isolates you specified as 'MD-'. Also, please be aware that SRA run accessions generally start with SRR, ERR, or DRR. The SRS numbers you listed as SRA are sample accessions, not run accessions, so they cannot be downloaded using the SRA Toolkit.
Sample (SRS) | Run (SRR) |
---|---|
SRS5357598 | SRR10080716 |
SRS5431100 | SRR10177472 |
SRS5832736 | SRR10698366 |
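To make the run/sample distinction concrete, here is a minimal Python sketch that classifies an accession by its prefix and looks up the run for each sample in the table above. The function and variable names are made up for illustration and are not part of el_gato or the SRA Toolkit. It assumes the usual INSDC convention: the first letter marks the archive (S, E, or D) and the third letter marks the record type.

```python
import re

# Assumed prefix convention: [S|E|D] + "R" + record type letter + digits.
# R = run, X = experiment, S = sample, P = study.
_ACCESSION = re.compile(r"^([SED])R([RXSP])(\d+)$")
_KINDS = {"R": "run", "X": "experiment", "S": "sample", "P": "study"}

# Sample-to-run pairs copied from the table above.
SAMPLE_TO_RUN = {
    "SRS5357598": "SRR10080716",
    "SRS5431100": "SRR10177472",
    "SRS5832736": "SRR10698366",
}


def accession_kind(accession: str) -> str:
    """Return 'run', 'experiment', 'sample', 'study', or 'unknown'."""
    match = _ACCESSION.match(accession.strip().upper())
    if match is None:
        return "unknown"
    return _KINDS[match.group(2)]


def is_run(accession: str) -> bool:
    """True only for run accessions (SRR/ERR/DRR)."""
    return accession_kind(accession) == "run"


if __name__ == "__main__":
    for sample, run in SAMPLE_TO_RUN.items():
        print(sample, accession_kind(sample), "->", run, accession_kind(run))
```

A check like `is_run` could be used to catch sample accessions before attempting a download.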
After viewing the reads in IGV for isolate SRR10698366, which has an 'MD-' call, I can see reads that map across the entire neuA region. However, there are fewer reads in the middle of the region. We will get back to you, as there are a few more things to investigate regarding whether this should be a new neuA reference.
Hi @vascokarla
I do not think SRR10698366 is a new neuA reference. There are a few reasons for this:
1) Our old in silico SBT tool generated a full ST call of ST1.
2) The assembled genome from NCBI (GCA_015963385.1_PDT000646143.1_genomic.fna) also produced a full ST call of ST1.
3) Lowering the depth threshold in el_gato from the default (-d 5) generated a full ST call of ST1.
4) When the depth is at the default (-d 10), the run.log indicates that one position in neuA can't be resolved with a depth of 10.
5) Pairwise alignment of the reference neuA allele and the one generated with a depth of 5 showed no differences.
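Points 4 and 5 can be illustrated with a small Python sketch. This is not el_gato's implementation: the function names are hypothetical, the depth values are made up, and treating a position as unresolved when its depth is strictly below the threshold is an assumption. It only shows why one low-coverage position can block a call at a threshold of 10 but not at 5, and how two equal-length allele sequences can be compared base by base.

```python
def unresolved_positions(depths, min_depth):
    """Return 1-based positions whose read depth is below min_depth.

    Assumption: a position needs at least min_depth reads to be resolved.
    """
    return [pos for pos, depth in enumerate(depths, start=1) if depth < min_depth]


def sequence_differences(reference, query):
    """Return (position, reference_base, query_base) for each mismatch.

    Only handles sequences of equal length; it does not perform an
    alignment with gaps.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must have the same length")
    return [
        (pos, ref_base, query_base)
        for pos, (ref_base, query_base) in enumerate(zip(reference, query), start=1)
        if ref_base != query_base
    ]


if __name__ == "__main__":
    # Made-up per-position depths with a single dip in the middle.
    example_depths = [14, 12, 11, 7, 12, 15]
    print(unresolved_positions(example_depths, 10))
    print(unresolved_positions(example_depths, 5))

    # Made-up sequences; identical sequences give an empty list.
    print(sequence_differences("ACGTAC", "ACGTAC"))
```

With these example depths, the dip to 7 reads fails a threshold of 10 but passes a threshold of 5, which mirrors the behaviour described for SRR10698366.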
Hi @jennahamlin,
Thank you so much for thoroughly checking the details and providing such a detailed response. I understand now that an update to the neuA reference is not needed, based on your investigation and the results you shared. I really appreciate the time and effort you’ve put into this!
Also, I apologize for the confusion with the SRS numbers instead of the SRR run numbers in my previous message. To clarify, in the SRA, SRR refers to the run, while SRS refers to the sample. I’ll be more mindful of that distinction moving forward.
Thanks again for your support
After the recent update to the database, which included the neuA_215 reference, we observed improvements in the identification of the neuA_neuAH locus in some of our Legionella pneumophila samples. This allowed for the subsequent Sequence-Based Typing (SBT) of these samples, which was not possible before the update. However, we have identified two cases where the neuA_neuAH locus is still missing, and we suspect that this may be due to the need for a closer reference in the database.
Additionally, we encountered a case where the mip gene was not identified by el_gato. We suspect this issue could also be related to the need for a closer reference in the database.
We are uploading a CSV file that contains details about the samples, including their public NCBI identifiers, results of SBT analysis with both the old and new database, and sequencing quality checks obtained from PHoeNIx.
Thank you so much for your updates and support!
sbt_check_allele_reference.csv