Closed tomsauv closed 1 year ago
Hi, you need to provide a text file with the ground truth label for each read, one label per line, in the same order as the input reads, e.g.:
yeast
yeast
e.coli
e.coli
human
mouse
Hello, I think I don't understand the meaning of "ground truth". By that do you mean a file telling the taxonomy for every read in the fasta file? How may I get that? Using Kraken2, perhaps?
Hi,
The ground truth parameter is only for evaluating the tool. It should not be used unless evaluation is your goal. It was implemented for parameter tuning and evaluation purposes.
I hope this clears things up.
Cheers Anuradha
Hi, thank you very much for the quick reply. That cleared things up, but now I have another question: how do you get a result like the one shown in the README (see below)? I can't find any file that looks like that result table in my lrb folder. Which tool should I use?
Example in README:
"""
                                                   Bin-0(4)  Bin-1(1)  Bin-2(3)  Bin-3(0)  Bin-4(5)  Bin-5(2)  Bin-6(6) Bin-7_(7)
CP002618.1_Lactobacillus_paracasei_strain_BD-II         744       973        92     70169       215         0         0         0
NC_011658.1_Bacillus_cereus_AH187                       110        11       277        10     30129         0         1       439
CP002807.1_Chlamydia_psittaci_08DC60_chromosome           0         0         9         0         1         0         0     11014
NC_012883.1_Thermococcus_sibiricus_MM_739                74         0     32441         1        28         2         0         8
"""
...and so on
That's the result for the test dataset in the README. The zip file should include the ground truth so that you can run the eval.py script.
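For anyone wondering how a species-by-bin table like the README example comes about, here is a minimal sketch that tallies one from a list of ground truth labels and a list of bin assignments in the same read order. This is not the actual eval.py code; the function name `confusion_table` and the toy data are made up for illustration.

```python
from collections import Counter


def confusion_table(truth, bins):
    """Count how many reads of each ground truth label fall into each bin.

    truth and bins are parallel lists: entry i of each describes read i.
    """
    if len(truth) != len(bins):
        raise ValueError("truth and bins must have one entry per read")
    counts = Counter(zip(truth, bins))
    species = sorted(set(truth))
    bin_ids = sorted(set(bins))
    # One row per species, one column per bin; missing pairs count as 0.
    table = [[counts[(s, b)] for b in bin_ids] for s in species]
    return species, bin_ids, table


# Toy data (hypothetical): six reads, three bins.
truth = ["yeast", "yeast", "e.coli", "e.coli", "human", "mouse"]
bins = [0, 0, 1, 1, 2, 2]

species, bin_ids, table = confusion_table(truth, bins)
print("\t".join([""] + [f"Bin-{b}" for b in bin_ids]))
for name, row in zip(species, table):
    print("\t".join([name] + [str(n) for n in row]))
```

Rows are ground truth labels and columns are bins, so a well-separated dataset shows one large count per row.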
Ok, I see, but then, my point is: how did you get the ids.txt in the test dataset? Which tool should we use?
You can use any tool to classify the reads, for example Minimap2 against reference genomes, or a classifier like Kraken2.
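As a rough sketch of the Minimap2 route: if the reads are mapped to the references with PAF output (for example `minimap2 -x map-ont refs.fa reads.fa > aln.paf`, where the preset and file names are assumptions), a label file with one line per read in FASTA order could be built as below. The helper names, the "best hit = most matching bases" rule, and the `Unknown` placeholder for unmapped reads are my own choices, not something the tool prescribes.

```python
def read_fasta_ids(fasta_lines):
    """Return read IDs in file order (text after '>' up to the first space)."""
    return [line[1:].split()[0] for line in fasta_lines if line.startswith(">")]


def best_hits(paf_lines):
    """Map each read to the reference with the most matching bases.

    PAF columns used: 1 = query name, 6 = target name, 10 = matching bases.
    """
    best = {}
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or truncated lines
        read, target, matches = fields[0], fields[5], int(fields[9])
        if read not in best or matches > best[read][1]:
            best[read] = (target, matches)
    return {read: target for read, (target, _) in best.items()}


def ground_truth(fasta_lines, paf_lines, unknown="Unknown"):
    """One label per read, in the same order as the FASTA input."""
    hits = best_hits(paf_lines)
    return [hits.get(read, unknown) for read in read_fasta_ids(fasta_lines)]


# Toy input (hypothetical); with real files, pass open(...) handles instead.
fasta = [">r1 sample read", "ACGT", ">r2", "GGCC", ">r3", "TTAA"]
paf = [
    "r1\t4\t0\t4\t+\tyeast_chrI\t100\t0\t4\t4\t4\t60",
    "r2\t4\t0\t4\t+\tecoli\t100\t0\t4\t3\t4\t10",
    "r2\t4\t0\t4\t-\thuman_chr1\t100\t0\t4\t4\t4\t60",
]
labels = ground_truth(fasta, paf)
print("\n".join(labels))
```

The target names come straight from the reference FASTA headers, so several contigs of one genome would appear as separate labels unless you collapse them to a species name first.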
OK, thank you very much!
Hi, I binned my reads. Could you provide some information on the format needed for the 'ground truth' file to provide to the eval.py script?
"--truth', '-t', help="Path to text file with grounds truth"
Thank you very much