AmpliconSuite / AmpliconSuite-pipeline

A quickstart tool for AmpliconArchitect. Performs all preliminary steps (alignment, CNV calling, seed interval detection) required prior to running AmpliconArchitect. Previously called PrepareAA.

coverage.stats file does not exist? #8

Closed alhafidzhamdan closed 4 years ago

alhafidzhamdan commented 4 years ago

Dear all,

I cloned AA from jluebeck's repository, downloaded the hg38 data, and stored it in data_repo as described. However, when I run

python AmpliconArchitect.py --bam ../../../WGS/alignments/E26T/E26T/E26T-ready.bam --bed ../../../WGS/variants/bcbio/E26/E26T/E26-cnvkit.bed --out E26 --ref GRCh38

I got

IOError: [Errno 2] No such file or directory: '/exports/igmm/eddie/lung-WGS/scripts/AmpliconArchitect/data_repo/coverage.stats'

I wonder whether I need to have a "coverage.stats" file in my data_repo directory and, if so, how I would obtain it?

Thanks for your help!

jluebeck commented 4 years ago

Hi,

The very first version of the data repo I uploaded for hg38 had left out coverage.stats, and it has since been corrected.

Assuming that you have that version of the data repo, you can either re-download it or fix the issue easily by creating an empty file "coverage.stats" with global write permissions. AA populates this file as it runs, so it starts out empty.

To create the empty file:

cd $AA_DATA_REPO
# perhaps do 'ls' to make sure there are three folders present here
touch coverage.stats
chmod a+rw coverage.stats
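
If you would rather script this check (for example, before launching several samples), the same steps can be sketched in Python. This is only a minimal sketch, not part of AA itself: the function name `ensure_coverage_stats` is made up for illustration, and it assumes the data repo path is the one you have in `$AA_DATA_REPO`.

```python
import stat
from pathlib import Path


def ensure_coverage_stats(data_repo: str) -> Path:
    """Create an empty coverage.stats in data_repo if it is missing.

    Hypothetical helper: mirrors the manual touch/chmod steps above.
    """
    repo = Path(data_repo)
    if not repo.is_dir():
        raise NotADirectoryError(f"data repo not found: {repo}")

    stats_file = repo / "coverage.stats"
    # touch() creates the file if absent and leaves existing contents alone.
    stats_file.touch(exist_ok=True)

    # Add read/write bits for user, group, and others on top of the
    # current mode, so every user who runs AA can update the file.
    rw_all = (
        stat.S_IRUSR | stat.S_IWUSR
        | stat.S_IRGRP | stat.S_IWGRP
        | stat.S_IROTH | stat.S_IWOTH
    )
    stats_file.chmod(stats_file.stat().st_mode | rw_all)
    return stats_file
```

Calling `ensure_coverage_stats(os.environ["AA_DATA_REPO"])` (after `import os`) should leave an existing, already-populated coverage.stats untouched and only create the file when it is missing.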

Please let me know if you run into any more errors!

alhafidzhamdan commented 4 years ago

Thanks for your quick response. Amazing stuff. It seems to be running OK so far. I'm testing it on a sorted BAM file (60X) and a BED file called by CNVkit. How long does it usually take?

jluebeck commented 4 years ago

Glad to hear it seems to be working!

It's very hard to predict the runtime for a sample. Sometimes it will finish very quickly if there are no amplicons or only a single trivial amplicon. However, the runtime can reach into the range of hundreds of hours for extremely complex samples. In my experience, if it isn't done in 8 hours you're in for a longer ride.

Viraj is working on improvements to the runtime in these extreme cases, but the improvements are still very buggy. I will update PrepareAA and my fork as soon as they are finished.