Open rezaeir opened 2 years ago
@rezaeir Can you provide one HSV-1 genome to us? First, we found that the number of HSV-1 genomes in the current virus sketch is insufficient, and we would like to update it. Second, the -g option was designed for bacteria but can be revised for viruses. Finally, we are considering pulling more genomes from NCBI Virus instead of RefSeq. It would be great if you could provide one for revising and testing.
@ythuang0522 Unfortunately, the number of HSV-1 genomes in NCBI is very limited. This https://www.ncbi.nlm.nih.gov/nuccore/NC_001806.2?report=fasta is the RefSeq file that you can access. I have also attached a file that I generated using ViPR, containing more than 80 full HSV-1 genomes. HSV1_ViPR_DB.zip
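Before pointing a polisher at a local database like the ViPR export above, it can help to confirm how many genomes the FASTA actually holds and whether they are full length. The following is a minimal Python sketch using only the standard library. The function name `fasta_record_lengths` is made up for illustration and is not part of homopolish or ViPR.

```python
def fasta_record_lengths(path):
    """Return {record_id: sequence_length} for a (multi-)FASTA file.

    Hypothetical helper for sanity-checking a local reference database.
    Records that share an id overwrite each other in this simple version.
    """
    lengths = {}
    current = None
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines between records
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the id.
                tokens = line[1:].split()
                current = tokens[0] if tokens else f"record_{len(lengths) + 1}"
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths
```

Calling `len(fasta_record_lengths("my_db.fasta"))` on the unzipped database (the file name here is a placeholder) gives the genome count, and unusually short entries in the returned dictionary would point to partial genomes.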
I used that ViPR database and it seems that it could marginally decrease the number of gaps (from 126 to 117).
Thanks for your response. I meant: could you provide the viral genome after Medaka polishing, for development? I was not aware of ViPR. It looks like you can use -l (local database) for polishing. However, that path still lacks the ANI selection step, which we are considering adding to the local DB version. As we don't have an ONT viral genome at hand, it would be better if you could provide one.
Sorry for the very late response. The following file is my MinION R9.4 sequencing of an HSV genome with a GFP insertion in its Tk gene locus. RR-tkHSV.raven.medaka.zip
I have the same problem. I would like to polish HSV-1, HSV-2, VZV, KSHV and HCMV assemblies. I started with HSV-2, and it failed right away. Same issue...
@rezaeir We have polished the virus with -l (local database) using your HSV DB and tested some thresholds. Mismatches and indels are assessed by fastmer (comparing the pre-polish and post-polish files). homopolished_1 is the default result, which equals yours, and we are curious how you got the gap count. We would appreciate it if you could provide a reference for the virus so we can adjust our program.

| | mismatch | insertion | deletion |
|---|---|---|---|
| homopolished_1 | 0 | 75 | 65 |
| homopolished_2 | 0 | 66 | 62 |
| homopolished_3 | 0 | 72 | 73 |

hsv.zip
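To make the three columns concrete, here is a toy Python sketch of how mismatches, insertions and deletions could be tallied from a pairwise alignment of the pre-polish and post-polish sequences. It is not how fastmer works internally. It assumes the two sequences have already been aligned by some other tool and padded with `-` to equal length, and the function name is made up for illustration.

```python
def count_differences(aligned_before, aligned_after):
    """Tally column-wise differences between two gap-padded, equal-length strings.

    Assumed convention: a base present only in `aligned_after` is an insertion,
    a base present only in `aligned_before` is a deletion.
    """
    if len(aligned_before) != len(aligned_after):
        raise ValueError("aligned sequences must have the same length")
    counts = {"mismatch": 0, "insertion": 0, "deletion": 0}
    for before, after in zip(aligned_before.upper(), aligned_after.upper()):
        if before == after:
            continue  # identical column, nothing to count
        if before == "-":
            counts["insertion"] += 1
        elif after == "-":
            counts["deletion"] += 1
        else:
            counts["mismatch"] += 1
    return counts
```

For example, `count_differences("ACG-TTAC", "ACGATT-G")` counts one insertion, one deletion and one mismatch. This version counts per base, so a multi-base indel adds one count per column; a tool that counts per event would report lower numbers.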
@steinbrl Hi, you can use -l (local database) to polish if you have a virus database. If the program can't find a sufficiently close virus in our database, it skips polishing because there are too few homologous viral genomes. It would be great if you could provide your assemblies and database (if you have one).
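A local-database run as suggested above might be assembled like this. It is a sketch only. Only `-l` and `-g` are named in this thread; the `polish` subcommand and the `-a`, `-m` and `-o` flags are recalled from the Homopolish README and should be checked against `python3 homopolish.py polish --help`. All file names in the usage note are placeholders, and the helper only builds the argument list without running anything.

```python
import shlex


def build_local_polish_command(assembly, local_db, model, out_dir,
                               script="homopolish.py"):
    """Build an argv list for polishing against a user-supplied reference FASTA.

    Flags other than -l are assumptions based on the Homopolish README.
    """
    return [
        "python3", script, "polish",
        "-a", assembly,   # assembly to polish, e.g. the Medaka consensus (assumed flag)
        "-l", local_db,   # local reference FASTA instead of a .msh sketch
        "-m", model,      # trained model file matching the flow cell (assumed flag)
        "-o", out_dir,    # output directory (assumed flag)
    ]


def as_shell_string(argv):
    """Render the argv list as a copy-pasteable shell command."""
    return shlex.join(argv)
```

For instance, `as_shell_string(build_local_polish_command("consensus.fasta", "HSV1_ViPR_DB.fasta", "R9.4.pkl", "polished"))` gives a one-line command to paste into a terminal. Passing the list itself to `subprocess.run` avoids shell-quoting problems with unusual paths.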
Hi, I've attached a reference FASTA file for HSV-GFP, which is an assembly from very high-depth short-read sequencing. I was wondering if you plan to add an internal virus database, maybe based on NCBI Virus? hsv1-gfp-genome.txt
I am trying to use homopolish to improve the assembly consensus after using Raven for assembly and Medaka for primary polishing. However, when I use the virus.msh file with my consensus.fasta file as input, the output is: ``
Is there any way I can fix this? I also tried the "-g" option with "humanalphaherpesvirinae_humanalphaherpesvirus1" as the genus_species input, but I am not sure this is the right way to write it (the result showed no changes compared to Medaka's output).