Closed by ATCGCGCTC 6 years ago
Hi there,
Thank you very much for the interest in VirFinder.
The RefSeqs used for training and testing are listed in additional_file_Table2.xlsx under the "supplementary_data" directory. The first column holds the accession numbers, which we used to download the corresponding genomes from NCBI. The discovery dates of the RefSeqs are in the third column. The RefSeqs were split into non-overlapping fragments, which were then used for training and testing. Hope that helps!
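For anyone trying to reproduce this, here is a rough sketch of the two steps mentioned above: grouping accessions by date around the 1 January 2014 cutoff quoted in the question, and cutting a genome into non-overlapping fragments. This is not the VirFinder code. The function names, the fragment length, the choice to drop a short trailing fragment, and the accession placeholders are all assumptions made for illustration, and which side of the cutoff was used for training versus testing is not stated here.

```python
from datetime import date

# Cutoff date taken from the question in this thread.
CUTOFF = date(2014, 1, 1)


def partition_by_date(records, cutoff=CUTOFF):
    """Split (accession, discovery_date) pairs into before/after the cutoff.

    Hypothetical helper: `records` stands in for rows read from the
    accession and date columns of the supplementary table.
    """
    before = [acc for acc, d in records if d < cutoff]
    after = [acc for acc, d in records if d >= cutoff]
    return before, after


def split_into_fragments(seq, frag_len, keep_tail=False):
    """Cut `seq` into consecutive, non-overlapping pieces of `frag_len`.

    Assumption: a trailing piece shorter than `frag_len` is dropped
    unless `keep_tail` is True. The thread does not say how the tail
    was actually handled.
    """
    if frag_len <= 0:
        raise ValueError("frag_len must be positive")
    frags = [seq[i:i + frag_len] for i in range(0, len(seq), frag_len)]
    if frags and not keep_tail and len(frags[-1]) < frag_len:
        frags.pop()
    return frags


if __name__ == "__main__":
    # Placeholder accessions and dates, not real table entries.
    records = [("ACC_A", date(2013, 5, 1)), ("ACC_B", date(2014, 1, 1))]
    before, after = partition_by_date(records)
    print(before, after)
    # Toy sequence and fragment length, for illustration only.
    print(split_into_fragments("ATCGCGCTCA", 3))
```

The real fragment lengths and tail handling would need to be taken from the paper or confirmed by the authors.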
Best wishes, Jessie
Hey Jie and Nathan, maybe I entirely missed it, in which case I am very sorry to bother you, but could you make publicly available the exact data subset from "RefSeq virus and prokaryotic genomes sequenced from before and after 1 January 2014" that was used to train and test the model? I'd love to try to match your results! Thanks in advance.