tjcreedy / metamate

Your metabarcoding friend! Filter erroneous and unwanted amplicons.
GNU General Public License v3.0

Requesting more documentation for interpreting the _results.csv file (via R) and the -i parameter for metamate dump #4

Closed: morien closed this issue 2 months ago

morien commented 2 years ago

I'd like to request some additional documentation in the readme.

First, your R analysis file uses some columns that look useful for filtering the results but are not documented, for example the 'data', 'part', and 'stat' columns. A short description of each would help.

Second, I'm having trouble working out from the existing documentation exactly which part of the results the -i parameter operates on when using metamate dump. In your example you say an imaginary user has chosen 35 as the optimal threshold. It would be helpful to show both how that user arrived at 35 and how it is then applied to the data in _results.csv. Looking at my own _results.csv file, I'm not certain which column -i applies to.

johnrandallia commented 2 months ago

The choice of resultindex depends heavily on the study setting. In some studies it might be fine to keep a few NUMTs, so you could, for example, base your decision solely on the accuracy score or on a chosen percentage of vnaASV that survive. In a strict setting, I recommend minimizing the estimated nonauthentics while maximizing the accuracy score.
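To make the two approaches concrete, here is a minimal Python sketch of how one might pick a resultindex from a results table. It is only an illustration: the column names `accuracy_score` and `est_nonauthentic` and all the values are made up for this example, so check the header of your own _results.csv and substitute the matching columns.

```python
import csv
import io

# Made-up excerpt standing in for a _results.csv file. Only "resultindex"
# is a name taken from this thread; the other two column names and every
# value are assumptions for illustration.
SAMPLE = """resultindex,accuracy_score,est_nonauthentic
33,0.81,12
34,0.88,4
35,0.93,0
36,0.90,0
37,0.95,3
"""


def load_rows(text):
    """Parse the CSV text into a list of dicts with numeric values."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "resultindex": int(r["resultindex"]),
            "accuracy_score": float(r["accuracy_score"]),
            "est_nonauthentic": float(r["est_nonauthentic"]),
        }
        for r in reader
    ]


def pick_strict(rows):
    """Strict setting: fewest estimated nonauthentics, then highest accuracy."""
    best = min(rows, key=lambda r: (r["est_nonauthentic"], -r["accuracy_score"]))
    return best["resultindex"]


def pick_lenient(rows, max_nonauthentic):
    """Lenient setting: tolerate up to max_nonauthentic, then take the
    highest accuracy (ties broken by fewer estimated nonauthentics)."""
    allowed = [r for r in rows if r["est_nonauthentic"] <= max_nonauthentic]
    if not allowed:
        return None
    best = max(allowed, key=lambda r: (r["accuracy_score"], -r["est_nonauthentic"]))
    return best["resultindex"]


rows = load_rows(SAMPLE)
print(pick_strict(rows))      # 35
print(pick_lenient(rows, 5))  # 37
```

With this toy table the strict rule lands on 35, while tolerating up to 5 estimated nonauthentics moves the choice to 37 because of its higher accuracy score. The selected value is what would then be given to metamate dump via -i.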