mani2012 / PathoStat

The purpose of this package is to perform statistical analysis on the report files generated by PathoScope.

Load reports #14

Closed by mlbendall 7 years ago

mlbendall commented 7 years ago

Here is a function for loading all data from PathoID reports. Not sure if you've done this already, but this is what I've been using for other projects. Use it as-is or just copy a few snippets.

mani2012 commented 7 years ago

We already have a function that gets the top 10 genomes across all the samples from the PathoScope reports.

mlbendall commented 7 years ago

No problem. This is just a function I've been using in the past. It returns a list of matrices with all variables estimated in the PathoID report, plus the total read counts and total genome counts. Thought it might be useful for pulling in this other data.
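For illustration only, here is a minimal Python sketch of a loader that returns this kind of structure: one genome-by-sample table per report variable, plus the total read and genome counts per sample. It is not the function discussed in this thread. The names `parse_report` and `load_reports` are hypothetical, and the report layout (a tab-separated first line carrying the two totals, then a header row, then one row per genome) is an assumption that should be checked against real PathoID output.

```python
import csv
import io


def parse_report(text):
    """Parse one report passed as a string (layout is an assumption)."""
    lines = text.strip().splitlines()
    # Assumed first line: label, total reads, label, total genomes.
    first = lines[0].split("\t")
    total_reads, total_genomes = int(first[1]), int(first[3])
    # Assumed second line: column header, with the genome name first.
    reader = csv.reader(io.StringIO("\n".join(lines[1:])), delimiter="\t")
    header = next(reader)
    rows = {r[0]: dict(zip(header[1:], map(float, r[1:]))) for r in reader if r}
    return total_reads, total_genomes, rows


def load_reports(reports):
    """Combine reports into one genome-by-sample table per variable.

    `reports` maps sample name -> report text. A genome that is absent
    from a sample gets 0.0 for every variable in that sample.
    """
    parsed = {s: parse_report(t) for s, t in reports.items()}
    samples = list(parsed)
    genomes = sorted({g for _, _, rows in parsed.values() for g in rows})
    # Collect variable names in first-seen order.
    variables = []
    for _, _, rows in parsed.values():
        for values in rows.values():
            for name in values:
                if name not in variables:
                    variables.append(name)
    tables = {
        v: {g: [parsed[s][2].get(g, {}).get(v, 0.0) for s in samples]
            for g in genomes}
        for v in variables
    }
    return {
        "samples": samples,
        "tables": tables,
        "total_reads": {s: parsed[s][0] for s in samples},
        "total_genomes": {s: parsed[s][1] for s in samples},
    }
```

Each entry of `tables` maps a genome to a list of values ordered like `samples`, so every variable in the report stays available for later analysis instead of only the leading rows.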

mani2012 commented 7 years ago

In that case, we would need to write another function to get the top 10 genomes from this list. I think we should worry about this later and focus on other modules, particularly the ones suggested by Marcos.

mlbendall commented 7 years ago

As written, it doesn't appear to be getting the top 10 genomes across all samples; it is just reading the first 10 lines from each PathoID report. I'm not sure we want to make that assumption. At the very least, the number should be configurable by the user.

mani2012 commented 7 years ago

The PathoScope reports are already sorted, and hence the top 10 lines do correspond to the top 10 genomes in each report. We just thought 10 was a good enough number, although I agree that it should be user-configurable.
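To make the distinction in this exchange concrete, here is a small Python sketch with a configurable `n`. It is not PathoStat's code: the function names are hypothetical, the input is assumed to be already parsed into per-sample lists sorted by descending score, and ranking genomes by their summed score across samples is only one possible definition of "top across all samples".

```python
def first_n_per_sample(reports, n=10):
    """Take the first n genomes of each sample's sorted report.

    `reports` maps sample name -> list of (genome, score) pairs,
    assumed sorted by score in descending order.
    """
    return {s: [g for g, _ in rows[:n]] for s, rows in reports.items()}


def top_n_across_samples(reports, n=10):
    """Rank genomes by summed score over all samples and keep the top n.

    Summing is an assumption made for this sketch; a mean or a maximum
    would be equally defensible choices.
    """
    totals = {}
    for rows in reports.values():
        for genome, score in rows:
            totals[genome] = totals.get(genome, 0.0) + score
    # Sort by descending total, breaking ties by genome name.
    ranked = sorted(totals, key=lambda g: (-totals[g], g))
    return ranked[:n]


reports = {
    "A": [("g1", 0.6), ("g2", 0.4)],
    "B": [("g3", 0.5), ("g2", 0.45), ("g1", 0.05)],
}
per_sample = first_n_per_sample(reports, n=1)
overall = top_n_across_samples(reports, n=1)
```

In this toy input, `g2` is never the first line of either report, yet it has the highest combined score, so the two approaches can disagree whenever `n` is smaller than the number of genomes in a report.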