cancerit / ascatNgs

Somatic copy number analysis using WGS paired end wholegenome sequencing
http://cancerit.github.io/ascatNgs/
GNU Affero General Public License v3.0

cannot found ascat.R file #91

Closed: wanhui5867 closed this issue 4 years ago

wanhui5867 commented 4 years ago

Hi,

When I ran ascat.pl, I got an error saying that the ascat.R file is used but is not provided in the current ascatNgs package. The detailed error is as follows:

` Errors from command: cd ./ascat/output/result/tmpAscat/ascat; /usr/bin/Rscript ~/software/ascatNgs/perl/bin/../share/ascat/runASCAT.R ~/software/ascatNgs/perl/bin/../share/ascat ./ascat/output/result/tmpAscat/SnpPositions.tsv ./ascat/output/result/tmpAscat/SnpGcCorrections.tsv MC1D MC1D.count MC1PBL MC1PBL.count XY 24 ./ascat/output/result/tmpAscat/ascat/MC1D.Rdata "c('1','10','11','12','13','14','15','16','17','18','19','2','20','21','22','3','4','5','6','7','8','9','X','Y')" "c('1','10','11','12','13','14','15','16','17','18','19','2','20','21','22','3','4','5','6','7','8','9','X','Y')"

Error in file(filename, "r", encoding = encoding) : cannot open the connection
Calls: source -> file
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
  cannot open file '~/software/ascatNgs/perl/bin/../share/ascat/ascat.R': No such file or directory
Execution halted `

Question: Is runASCAT.R the same file as ascat.R? Or is ascat.R missing from the current version?

Many thanks!

keiranmraine commented 4 years ago

The setup.sh script pulls this from the ASCAT repository. There are also pre-built Docker images known to work under Singularity:

https://quay.io/repository/wtsicgp/ascatngs
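The error log above shows runASCAT.R being handed the share/ascat directory and then failing to source ascat.R from it. A minimal sketch of a pre-flight check is given here, assuming only that both files are expected side by side in that directory; the helper name `missing_files` is hypothetical and not part of ascatNgs.

```python
import tempfile
from pathlib import Path

# File names taken from the error message in this thread.
REQUIRED = ("runASCAT.R", "ascat.R")

def missing_files(share_dir, required=REQUIRED):
    """Return the required file names that are absent from share_dir."""
    share = Path(share_dir).expanduser()  # expands a leading "~"
    return [name for name in required if not (share / name).is_file()]

# Demonstration on a throwaway directory that only contains runASCAT.R.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "runASCAT.R").write_text("# placeholder\n")
    print(missing_files(tmp))  # ['ascat.R']
```

Pointing `missing_files` at the real share/ascat directory before calling ascat.pl would surface the missing ascat.R without waiting for the R step to fail.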

wanhui5867 commented 4 years ago

@keiranmraine Dear keiranmraine,

Thanks for your help. The ascat.R problem was solved as follows:

Method 1: I used conda to install the cancerit-allelecount, cgpvcf, and perl-pcap packages, then I ran ./setup.sh . . But it reported that alleleCount should be installed first, even though alleleCounter -v reports 4.0.2 (>3.3.0). So I downloaded ascat.R via the link from setup.sh: "https://raw.githubusercontent.com/Crick-CancerGenomics/ascat/v2.5.1/ASCAT/R/ascat.R", following your answer. Then I reran ascat.pl, and it raised a new error:

Errors from command: cd ./exampledata/testData/result/tmpAscat/ascat; /usr/bin/Rscript ~/software/ascatNgs/perl/bin/../share/ascat/runASCAT.R ~/software/ascatNgs/perl/bin/../share/ascat ./exampledata/testData/result/tmpAscat/SnpPositions.tsv ./exampledata/testData/result/tmpAscat/SnpGcCorrections.tsv TESTMUT TESTMUT.count TESTNORM TESTNORM.count XY 24 ./exampledata/testData/result/tmpAscat/ascat/TESTMUT.Rdata "c('1','10','11','12','13','14','15','16','17','18','19','2','20','21','22','3','4','5','6','7','8','9','X','Y')" "c('1','10','11','12','13','14','15','16','17','18','19','2','20','21','22','3','4','5','6','7','8','9','X','Y')"

Error in apply(corr_tot, 1, function(x) sum(abs(x * length_tot))/sum(length_tot)) : 
  dim(X) must have a positive length
Calls: ascat.GCcorrect -> apply
Execution halted

Method 2: Then I tried running it with Docker, but it gives the same error:

/usr/bin/Rscript /opt/wtsi-cgp/lib/perl5/auto/share/module/Sanger-CGP-Ascat-Implement/ascat/runASCAT.R /opt/wtsi-cgp/lib/perl5/auto/share/module/Sanger-CGP-Ascat-Implement/ascat /home/ubuntu/result/tmpAscat/SnpPositions.tsv /home/ubuntu/result/tmpAscat/SnpGcCorrections.tsv MC1D MC1D.count MC1PBL MC1PBL.count XY 24 /home/ubuntu/result/tmpAscat/ascat/MC1D.Rdata 'c('\''1'\'','\''10'\'','\''11'\'','\''12'\'','\''13'\'','\''14'\'','\''15'\'','\''16'\'','\''17'\'','\''18'\'','\''19'\'','\''2'\'','\''20'\'','\''21'\'','\''22'\'','\''3'\'','\''4'\'','\''5'\'','\''6'\'','\''7'\'','\''8'\'','\''9'\'','\''X'\'','\''Y'\'')' 'c('\''1'\'','\''10'\'','\''11'\'','\''12'\'','\''13'\'','\''14'\'','\''15'\'','\''16'\'','\''17'\'','\''18'\'','\''19'\'','\''2'\'','\''20'\'','\''21'\'','\''22'\'','\''3'\'','\''4'\'','\''5'\'','\''6'\'','\''7'\'','\''8'\'','\''9'\'','\''X'\'','\''Y'\'')'
Error in apply(corr_tot, 1, function(x) sum(abs(x * length_tot))/sum(length_tot)) :
  dim(X) must have a positive length
Calls: ascat.GCcorrect -> apply
Execution halted
Command exited with non-zero status 1
403.56user 6.30system 7:13.58elapsed 94%CPU (0avgtext+0avgdata 1880184maxresident)k
2146632inputs+0outputs (33213major+755293minor)pagefaults 0swaps

I tested both your testData and my own data, and both give the same error. I don't know why, so I'd like to ask for your help again.

Thanks for your kind help.

keiranmraine commented 4 years ago

Hi,

We do not support conda packaging issues.

The error you now have is a data error. Do your input reference/BAM/CRAM files have chr prefixes on the chromosome names? By default, ascatNgs (and the ASCAT R library) expects 1, 2, 3... This is covered in the wiki.

keiranmraine commented 4 years ago

... If the reference is GRCh37.
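One way to answer the prefix question is to compare contig names from the BAM header with those in the reference index. The sketch below assumes the header text comes from something like `samtools view -H` (contig names sit in the `SN:` tag of `@SQ` lines) and that the .fai index lists the contig name in its first tab-separated column. The function names and the sample strings are illustrative only.

```python
def contigs_from_sam_header(header_text):
    """Collect contig names from the SN: tag of each @SQ header line."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names

def contigs_from_fai(fai_text):
    """Collect contig names from the first column of a FASTA index."""
    return [ln.split("\t")[0] for ln in fai_text.splitlines() if ln.strip()]

def naming_style(names):
    """Classify contig names as 'chr', 'plain', 'mixed' or 'empty'."""
    if not names:
        return "empty"
    prefixed = [name.startswith("chr") for name in names]
    if all(prefixed):
        return "chr"
    return "mixed" if any(prefixed) else "plain"

# Illustrative inputs, not taken from the files discussed in this thread.
bam_header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chrX\tLN:800\n"
fai_index = "1\t1000\t3\t60\t61\nX\t800\t1023\t60\t61\n"

bam_style = naming_style(contigs_from_sam_header(bam_header))
ref_style = naming_style(contigs_from_fai(fai_index))
if bam_style != ref_style:
    print(f"Naming mismatch: BAM uses '{bam_style}', reference uses '{ref_style}'")
```

A mismatch like the one printed here is the situation described in this thread: the alignments and the reference files disagree on whether chromosome names carry the chr prefix.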

wanhui5867 commented 4 years ago

Thanks for your quick reply.

Do you mean this is caused by the file reference/genome.fa? I tried to download it from ftp://ftp.sanger.ac.uk/pub/cancer/support-files/CPIB/ascatNgs/Human/GRCh37/, but the genome.fa file there is a link, so I downloaded it from ftp://ftp.sanger.ac.uk/pub/cancer/dockstore/human/core_ref_GRCh37d5.tar.gz instead.

Its headers are as follows (output of grep '>' genome.fa): (image)

So I need to add 'chr' to each header, right?

wanhui5867 commented 4 years ago

Or do you mean that my BAM file should not include the 'chr' prefix? Here is what my input BAM file looks like: (image)

Do you have any suggestions for quickly modifying the BAM files, or for modifying ascat.R?

keiranmraine commented 4 years ago

You need to make the reference files match your BAM file. As indicated in the wiki link I posted, you need to update all of the reference files: not just the fa/fai, but also those under the ascat folder from CNV_SV_ref_GRCh37d5_brass6+.tar.gz.

You could modify the BAM files instead, but that is incredibly compute-intensive and you'd need to do it for each file, whereas updating the reference files is a one-time exercise.
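The one-time reference update could be sketched roughly as follows. This is an illustration, not the maintainers' procedure: the function names are hypothetical, and the column that holds the chromosome name in each tab-separated reference file (and whether the file has a header row) is an assumption that must be checked against the actual files. Note that a .fai index stores byte offsets into the FASTA, so after editing the headers it is safer to regenerate the index (for example with `samtools faidx genome.fa`) than to edit names in the old index.

```python
def prefix_fasta_headers(lines, prefix="chr"):
    """Yield FASTA lines, adding prefix to header names that lack it."""
    for line in lines:
        if line.startswith(">") and not line.startswith(">" + prefix):
            yield ">" + prefix + line[1:]
        else:
            yield line

def prefix_tsv_column(lines, column, prefix="chr", has_header=False):
    """Yield tab-separated lines with prefix added to one 0-based column.

    `column` and `has_header` are assumptions about the file layout.
    """
    for i, line in enumerate(lines):
        if (has_header and i == 0) or not line.strip():
            yield line  # leave header rows and blank lines untouched
            continue
        fields = line.rstrip("\n").split("\t")
        if not fields[column].startswith(prefix):
            fields[column] = prefix + fields[column]
        yield "\t".join(fields) + "\n"

def rewrite(src, dst, transform):
    """Stream src through transform into dst without loading it whole."""
    with open(src) as fin, open(dst, "w") as fout:
        fout.writelines(transform(fin))

# Small in-memory demonstration with made-up records.
fasta = [">1 example\n", "ACGT\n", ">X\n", "GGCC\n"]
table = ["id\tchr\tpos\n", "snp1\t1\t100\n", "snp2\tX\t200\n"]
print("".join(prefix_fasta_headers(fasta)), end="")
print("".join(prefix_tsv_column(table, column=1, has_header=True)), end="")
```

For real files, a call such as `rewrite("genome.fa", "genome.chr.fa", prefix_fasta_headers)` streams the conversion; the output path here is a placeholder.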

wanhui5867 commented 4 years ago

@keiranmraine Your suggestion was very useful. I added the chr prefix to genome.fa, genome.fa.fai, and SnpGcCorrections.tsv, and used -genderChr chrY. It finally runs successfully. Thank you very much!

One last question: my samples are from a male. In the germline logR plot, the line is not at y = 0 everywhere; the X and Y regions are at -1 (see below). The figures in the ascatNgs paper are from female samples, so I am not sure whether my result is correct for male samples. (image)

keiranmraine commented 4 years ago

As I understand it, this is the expected result. X is -1 from the normal as you are diploid across the genome and have Y. Y is not called AFAIK. We do not normally present the germline images to our users as they only confirm that you have run a matched sample.

Further queries on the graphs are probably more appropriately directed to the developers of the underlying R code of the algorithm:

https://github.com/Crick-CancerGenomics/ascat
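The -1 value for X in a male sample can be illustrated with simple arithmetic, under the assumption that logR behaves like a base-2 log ratio of observed copies against a two-copy (diploid) baseline. How ascatNgs actually derives the germline track is not described in this thread, so the function below is only a hypothetical illustration.

```python
import math

def expected_logr(copies, baseline_copies=2):
    """Base-2 log ratio of a copy count against a baseline copy count."""
    return math.log2(copies / baseline_copies)

print(expected_logr(1))  # single X in a male sample: -1.0
print(expected_logr(2))  # two-copy autosome: 0.0
```

Under this assumption, a single X chromosome sits one unit below the autosomes, which matches the plot described in the question.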

wanhui5867 commented 4 years ago

Got it! Thanks again.