jodyphelan / tbdb

Standard database for the TBProfiler tool
GNU Lesser General Public License v3.0

IDs not in tbdb.confidence.csv #25

Closed by arturotorreso 4 years ago

arturotorreso commented 4 years ago

Dear Jody,

I was wondering why some of the IDs from the list of mutations I get using get_genome_positions.py are not listed in the tbdb.confidence.csv file.

I attach a file with IDs that are in the get_genome_positions.py output but not in the tbdb.confidence.csv file.

Thanks, Arturo

tbprof_missingIDs.txt

jodyphelan commented 4 years ago

Hi @arturotorreso,

The tbdb.confidence.csv file is calculated from WGS data combined with DST data (~16,000 isolates). If a variant present in tbdb.csv is missing from the confidence file, it is because it was not found in the WGS data I used. Most likely these variants were found in experimental work but have not yet appeared in sequenced clinical isolates.

Jody
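For readers who want to reproduce the comparison discussed above, here is a minimal sketch of a set difference between two ID lists. The real column layouts of tbdb.csv and tbdb.confidence.csv are not shown in this thread, so the column name `id`, the helper names, and the toy data below are all assumptions made for illustration.

```python
import csv
import io


def read_id_column(handle, column):
    """Collect the non-empty values of one named column from an open CSV handle."""
    ids = set()
    for row in csv.DictReader(handle):
        # DictReader fills missing fields with None, so guard before stripping.
        value = (row.get(column) or "").strip()
        if value:
            ids.add(value)
    return ids


def missing_ids(all_ids, confidence_ids):
    """Return the IDs present in all_ids but absent from confidence_ids, sorted."""
    return sorted(set(all_ids) - set(confidence_ids))


# Toy in-memory data. The column name "id" and every value here are invented;
# they do not reflect the actual contents of the tbdb files.
mutations_csv = io.StringIO("id,drug\nmutA,drug1\nmutB,drug1\nmutC,drug2\n")
confidence_csv = io.StringIO("id,confidence\nmutA,high\nmutC,low\n")

all_ids = read_id_column(mutations_csv, "id")
confidence_ids = read_id_column(confidence_csv, "id")
print(missing_ids(all_ids, confidence_ids))  # ['mutB']
```

With real files, the `io.StringIO` objects would be replaced by `open(path, newline="")` handles and the column name adjusted to whatever the files actually use. Following the explanation above, any ID the comparison reports as missing would be a variant not observed in the WGS dataset used to compute the confidence values.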

arturotorreso commented 4 years ago

Hi Jody,

Great, that solves it then! It makes total sense.

Regards, Arturo