Closed 14zac2 closed 3 years ago
Hi Zoé,
I might have an idea of where the problem is coming from: in Viral-Track, chromosomes from the host are removed (line 307). The problem is that the names of those chromosomes can change depending on the reference genome/species. I would recommend updating the object "Chromosome_to_remove" in line 283 accordingly, with the correct chromosome names!
Hope this works!
Best
Pierre
Hi Pierre,
Thanks so much for the tip, it worked! It might be worth updating the code to accommodate multiple hosts. For anyone else referencing this issue, I fixed the script (for my situation) with a few lines of bash code and my host genome fasta file:
# Create list of chromosomes from my fasta file (host.fa)
grep ">" host.fa > chroms.txt
sed -i 's/>//g' chroms.txt
awk '{ print "\""$0"\""}' chroms.txt > chromQuote.txt
paste -s -d, chromQuote.txt > chromQuoteComma.txt
awk '{ print "Chromosome_to_remove = c("$0")"}' chromQuoteComma.txt > formattedChromList.txt
# The following assumes the Viral-Track code is located in ~
# Inserts the file I just created into the appropriate section of the code
sed -e '/KI270394.1/r formattedChromList.txt' ~/Viral-Track/Viral_Track_scanning.R > ~/Viral-Track/Viral_Track_scanning_2.R
# Now run as normal but call on Viral_Track_scanning_2.R
Many thanks again! Zoe
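For readers adapting the steps above, here is a minimal sketch that collapses them into one pipeline without intermediate files. The function name make_chrom_vector is hypothetical, and the sketch assumes each sequence name is the header text up to the first whitespace. That holds for many aligners' defaults, but it is worth checking against the names in your own index.

```shell
#!/usr/bin/env bash
# Sketch: print an R assignment listing every sequence name in a FASTA file.
# make_chrom_vector is a hypothetical helper, not part of Viral-Track.
# Assumption: the sequence name is the header text up to the first whitespace.
make_chrom_vector() {
    local fasta="$1"
    grep '^>' "$fasta" \
        | sed 's/^>//' \
        | awk '{ print "\"" $1 "\"" }' \
        | paste -s -d, - \
        | awk '{ print "Chromosome_to_remove = c(" $0 ")" }'
}
```

Running make_chrom_vector host.fa > formattedChromList.txt should produce a file that can be inserted with the sed command above. Unlike the original grep, the pattern is anchored to the start of the line and any description after the sequence name is dropped.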
@PierreBSC, @14zac2 Hi Zoe, thanks a lot for the update. I have tried what you mentioned and I am getting another error:
Aug 04 12:43:51 ..... started STAR run
Aug 04 12:43:51 ..... loading genome
Aug 04 12:45:00 ..... started 1st pass mapping
Aug 04 12:45:24 ..... finished 1st pass mapping
Aug 04 12:45:25 ..... inserting junctions into the genome indices
Aug 04 12:47:00 ..... started mapping
Aug 04 12:47:26 ..... finished mapping
Aug 04 12:47:31 ..... started sorting BAM
Aug 04 12:47:39 ..... finished successfully
Mapping ofhgmm_100_R2_extracted.fastq done !
All fastq files have been mapped successfully
Starting the BAM file analysis
Indexing of the bam file for hgmm_100_R2_extracted is done
Computing stat file for the bam file for hgmm_100_R2_extracted is done
Checking the mapping quality of each virus...
Export of the viral SAM file done for hgmm_100_R2_extracted
Error in { : task 1 failed - "different row counts implied by arguments"
Calls: %dopar% ->
Could you please help me with this issue?
Hello!
I'm trying to use Viral-Track to analyze single-cell woodchuck liver samples infected with the woodchuck hepatitis virus. When running your program, I experience the same error as mentioned in #4. Although R seemed to recognize the %dopar% command when I ran through the code line by line, that step consistently threw the error "Error in unserialize(socklist[[n]]) : error reading from connection". As you recommended in the thread, I tried changing both parallel loops involving %dopar% to regular for loops, but now I am experiencing a new error:
Do you have any idea as to why this might be occurring? I am worried that it might be due to my host genome, which consists of 3123 contigs. I did all the UMITools preprocessing requested and my package versions are
r/4.0.2 samtools/1.10 star/2.7.8a stringtie/2.1.3
plus R packages as follows:
Please let me know if I can provide any other information, and I look forward to hearing from you!