WGLab / VirTect

Detection of viruses from RNA-Seq on human samples

No results when testing VirTect with example data #5

Open MinS1 opened 4 years ago

MinS1 commented 4 years ago

Hi, I have installed VirTect and its dependencies. When I test VirTect using "bash Run_test_VirTect.sh", I don't get any results with the test data. The run prints a warning ("Warning: Encountered reference sequence with only gaps") and ends with the following error:

awk: cmd. line:1: { if ($2!=(ploc+1)) {if (ploc!=0){printf("%s %d-%d
awk: cmd. line:1: ^ unterminated string
awk: cmd. line:1: { if ($2!=(ploc+1)) {if (ploc!=0){printf("%s %d-%d
awk: cmd. line:1: ^ syntax error
The continous length
----------------------------------------Note: There is no real virus in the sample :)-------------------

Is this normal, or is something wrong? Any help is greatly appreciated. Thank you very much!
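For readers following the error above: the truncated awk fragment compares each position (`$2`) with the previous one plus one (`ploc+1`) and prints a `%s %d-%d` record, which looks like a step that collapses consecutive covered positions into continuous regions. The rest of the command is cut off in the error message, so the Python sketch below is only an assumption about the intended logic, not VirTect's actual code. The function name `merge_positions` and the three-column input (reference name, position, depth) are illustrative.

```python
def merge_positions(lines):
    """Collapse consecutive positions into (name, start, end) regions.

    Assumed input: whitespace-separated lines of "name position depth",
    sorted by name and position. This mirrors the apparent intent of the
    truncated awk fragment; it is not VirTect's own code.
    """
    regions = []
    name = start = prev = None
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        cur_name, pos = fields[0], int(fields[1])
        # A new region starts when the reference changes or a gap appears.
        if cur_name != name or pos != prev + 1:
            if name is not None:
                regions.append((name, start, prev))
            name, start = cur_name, pos
        prev = pos
    if name is not None:
        regions.append((name, start, prev))  # flush the last open region
    return regions


if __name__ == "__main__":
    demo = ["virusA 10 7", "virusA 11 9", "virusA 12 8", "virusA 20 5", "virusB 3 6"]
    for name, start, end in merge_positions(demo):
        print(f"{name} {start}-{end}")
```

If a step like this fails before writing anything, its output file would be empty regardless of whether viral reads were present, so an empty region file on its own may not be conclusive.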

AtlasCUMC commented 4 years ago

Thank you for using VirTect.

I think you should use your own real data, since this test data doesn't contain any virus. Your installation seems right; you just need to run it on your real data.

Thanks, Atlas.

MinS1 commented 4 years ago

Hi, @AtlasCUMC Thanks for your quick reply! Yes, I ran it on real data last night. The run has not finished yet, so I would like to ask how long a paired-end RNA-seq dataset takes to run. Also, is the software suitable for single-end RNA-seq data and for other NGS data, such as WES and WGS? Thank you~

Best, Shi

AtlasCUMC commented 4 years ago

No Problem!

For ~20M reads, VirTect will require ~20 hours. For now it is suitable only for paired-end RNA-seq, but it may be updated to support single-end data in a later version. Unfortunately, it is only available for RNA-seq so far.
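As a rough planning aid, the figure above (~20 hours for ~20M reads) works out to about one hour per million reads. The sketch below extrapolates from that single data point and assumes runtime scales linearly with read count, which the thread does not confirm; hardware, thread count, and read length will all change the real number. `estimate_hours` is a hypothetical helper, not part of VirTect.

```python
# Reference point quoted in the thread: ~20M reads take ~20 hours.
REF_READS = 20_000_000
REF_HOURS = 20.0


def estimate_hours(n_reads, ref_reads=REF_READS, ref_hours=REF_HOURS):
    """Linear extrapolation from one reference point (an assumption)."""
    if n_reads < 0:
        raise ValueError("n_reads must be non-negative")
    return n_reads * ref_hours / ref_reads


if __name__ == "__main__":
    for reads in (10_000_000, 20_000_000, 50_000_000):
        print(f"{reads:>11,} reads -> ~{estimate_hours(reads):.0f} h")
```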

Thanks,

Atlas.

MinS1 commented 4 years ago

Ok, I see. Thank you very much~

MinS1 commented 4 years ago

Hi, @AtlasCUMC After running VirTect on real paired-end RNA-seq data, there are no viruses in Final_continous_region.txt. I have tested 8 samples, and none of them has any virus results. The sizes of the output files are as follows:

4.2G ./accepted_hits.bam
4.0K ./align_summary.txt
0 ./continuous_region.txt
5.7M ./deletions.bed
0 ./Final_continous_region.txt
5.3M ./insertions.bed
16M ./junctions.bed
204K ./logs
4.0K ./prep_reads.info
19M ./unmapped_aln.bam
82M ./unmapped_aln.sam
18M ./unmapped_aln_sorted.bam
91M ./unmapped.bam
34M ./unmapped_sorted_1.fq
34M ./unmapped_sorted_2.fq
88M ./unmapped_sorted.bam
4.0K ./unmapped_viruses_count.txt

No other errors occurred when running VirTect, except this one from awk:

awk: cmd. line:1: { if ($2!=(ploc+1)) {if (ploc!=0){printf("%s %d-%d
awk: cmd. line:1: ^ unterminated string
awk: cmd. line:1: { if ($2!=(ploc+1)) {if (ploc!=0){printf("%s %d-%d
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: (FILENAME=- FNR=158) fatal: print to "standard output" failed (Broken pipe)

I want to know whether there is a problem with my run, or whether there is really no virus sequence in the samples. Thank you!
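One quick check that may help separate "a step failed" from "nothing was detected" is to see which output files are empty. In the listing above, `continuous_region.txt` and `Final_continous_region.txt` are both 0 bytes. The sketch below simply lists zero-byte files in an output directory; the helper name `empty_outputs` and the directory argument are placeholders, and an empty file alone does not prove which explanation is right.

```python
from pathlib import Path


def empty_outputs(out_dir):
    """Return sorted names of zero-byte regular files directly in out_dir."""
    return sorted(
        p.name for p in Path(out_dir).iterdir()
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    import sys
    # Usage: python check_outputs.py <output directory>
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in empty_outputs(target):
        print(f"empty: {name}")
```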

Best, Shi

AtlasCUMC commented 4 years ago

Hi, thank you for reaching out. I think there is no real virus; however, I am not sure which virus you are looking for, since VirTect detects only the viruses in this database: https://github.com/WGLab/VirTect/blob/master/List_of_all_virus_genomes.xlsx

Thanks, Atlas.

MinS1 commented 4 years ago

Thanks for your reply! That is the virus database I used.