Public-Health-Bioinformatics / cpo-pipeline

An analysis pipeline for the purpose of investigating Carbapenemase-Producing Organisms.
MIT License

Determine 'Expected species' automatically #4

Closed. dfornika closed this issue 5 years ago

dfornika commented 5 years ago

Our sample naming conventions include a short coded organism name, e.g.:

'Eco': 'Escherichia coli'
'Cfr': 'Citrobacter freundii'
'Kpn': 'Klebsiella pneumoniae'
...etc...

We may be able to make the 'Expected species' input parameter optional. We could attempt to parse the sample ID to determine the expected species, but only if an expected species isn't provided as an input parameter.
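A minimal sketch of that fallback logic, in Python. The function name `resolve_expected_species`, the `ORGANISM_CODES` table, and the sample ID layout (`BC11-Kpn005`, with the code at the start of a delimiter-separated token) are assumptions for illustration only; the thread does not specify the actual ID format.

```python
import re
from typing import Dict, Optional

# Hypothetical lookup table, built only from the three codes listed above.
ORGANISM_CODES: Dict[str, str] = {
    "Eco": "Escherichia coli",
    "Cfr": "Citrobacter freundii",
    "Kpn": "Klebsiella pneumoniae",
}


def resolve_expected_species(
    sample_id: str,
    expected_species: Optional[str] = None,
    codes: Dict[str, str] = ORGANISM_CODES,
) -> Optional[str]:
    """Return the expected species, preferring the explicit input parameter."""
    # An explicitly provided species always wins; the ID is not parsed at all.
    if expected_species:
        return expected_species

    # Assumed layout: tokens separated by '-', '_' or '.', with the organism
    # code at the start of one of them (e.g. 'Kpn005').
    for token in re.split(r"[-_.]", sample_id):
        for code, species in codes.items():
            if token.startswith(code):
                return species

    # No recognizable code: the caller decides how to handle a missing value.
    return None
```

With these assumptions, `resolve_expected_species("BC11-Kpn005")` would resolve to Klebsiella pneumoniae, while passing `expected_species` explicitly bypasses the parsing step.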

ddooley commented 5 years ago

Are you talking about BCCDC sample naming conventions? That suggests an install of the program could have a lookup table suited to the institution running it.
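One way such a per-institution lookup table might be handled is to load it from a small file shipped with each install. The sketch below assumes a two-column, tab-separated file (code, then species name) with optional `#` comment lines; the file format and the function name `load_organism_codes` are hypothetical and not part of the pipeline.

```python
import csv
from pathlib import Path
from typing import Dict, Union


def load_organism_codes(path: Union[str, Path]) -> Dict[str, str]:
    """Load a code-to-species table from a tab-separated file."""
    codes: Dict[str, str] = {}
    with Path(path).open(newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and '#' comment lines.
            if not row or row[0].startswith("#"):
                continue
            # Skip malformed rows that lack a species column.
            if len(row) < 2:
                continue
            codes[row[0].strip()] = row[1].strip()
    return codes
```

Keeping the table in a file rather than in the code would let each institution maintain its own conventions without modifying the program.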

dfornika commented 5 years ago

Yes, those are our internal naming conventions. That's a good suggestion.

dfornika commented 5 years ago

I've decided not to implement this because it's too specific to our naming convention and would be difficult to generalize well.

I've added an optional parameter that accepts an NCBI taxonomy ID for the 'expected organism'.
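For illustration, an optional taxonomy ID parameter could be declared with `argparse` roughly as follows. The flag names `--sample-id` and `--expected-organism-taxid` are placeholders chosen for this sketch, not the pipeline's actual options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with an optional NCBI taxonomy ID argument."""
    parser = argparse.ArgumentParser(
        description="Sketch of an optional expected-organism parameter."
    )
    # Hypothetical flag names, used here only for illustration.
    parser.add_argument("--sample-id", required=True, help="Sample identifier")
    parser.add_argument(
        "--expected-organism-taxid",
        type=int,
        default=None,  # Omitted means no expected organism was supplied.
        help="NCBI taxonomy ID of the expected organism (optional)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.expected_organism_taxid is None:
        print("No expected organism provided")
    else:
        print(f"Expected organism taxid: {args.expected_organism_taxid}")
```

A taxonomy ID avoids any dependence on local naming conventions: for example, 562 is the NCBI taxonomy ID for Escherichia coli and 573 is the one for Klebsiella pneumoniae.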