broadinstitute / ichorCNA

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.
GNU General Public License v3.0

ichorCNA failing to load a file from ftp.ncbi.nlm.nih.gov #88

Open ury opened 3 years ago

ury commented 3 years ago

Hi, I'm running ichorCNA in a cloud environment and it occasionally fails with the following error:

> cannot open the connection to 'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/GCF_000001405.13_GRCh37/GCF_000001405.13_GRCh37_assembly_report.txt'
> Calls: getSeqInfo ... .fetch_assembly_report_from_URL -> read.table -> file
> In addition: Warning messages:
> 1: version 0.26.1 of 'S4Vectors' masked by 0.24.4 in /usr/local/lib/R/site-library
> 2: In file(file, "rt") :
> URL 'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/GCF_000001405.13_GRCh37/GCF_000001405.13_GRCh37_assembly_report.txt': status was 'Failure when receiving data from the peer'

The command I'm using:

> Rscript /opt/bin/ichorCNA/scripts/runIchorCNA.R  \
>   --libdir /opt/bin/ichorCNA  \
>   --id ${PATIENT_ID}.${SAMPLE_ID}  \
>   --WIG ${PATIENT_ID}.${SAMPLE_ID}.wig  \
>   --gcWig /opt/bin/ichorCNA/inst/extdata/gc_hg38_1000kb.wig  \
>   --mapWig /opt/bin/ichorCNA/inst/extdata/map_hg38_1000kb.wig  \
>   --ploidy 2  \
>   --normal \"${NORMAL}\"  \
>   --normalPanel /opt/bin/ichorCNA/inst/extdata/HD_ULP_PoN_hg38_1Mb_median_normAutosome_median.rds  \
>   --maxCN 4  \
>   --includeHOMD FALSE  \
>   --chrs \"c(1:22)\"  \
>   --chrTrain \"c(1:22)\"  \
>   --estimateNormal TRUE  \
>   --estimateScPrevalence TRUE  \
>   --scStates \"c()\"  \
>   --txnE 0.9999  \
>   --txnStrength 10000  \
>   --genomeStyle UCSC  \
>   --centromere /opt/bin/ichorCNA/inst/extdata/GRCh38.GCA_000001405.2_centromere_acen.txt  \
>   --outDir ./

Note that ichorCNA runs inside a Docker container, and downloaded files are not persisted to the Docker image, so the download is attempted on every run rather than just once.

I have a couple of questions regarding the above error:

  1. Why is ichorCNA attempting to download GCF_000001405.13_GRCh37_assembly_report.txt in the first place?
  2. Is the sporadic connectivity issue a known one?
  3. How can I prevent ichorCNA from attempting to download files? If they are required, can I preemptively download them to a local folder which ichorCNA will use?
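
To make question 3 concrete, here is a minimal sketch of the kind of pre-download step I have in mind, assuming the assembly report could be fetched once (for example while building the image) and retried on transient FTP failures. The `retry` and `fetch_report` helpers and the `/opt/ref` directory are hypothetical names of mine; only the URL comes from the error message above, and I have not confirmed that ichorCNA can be pointed at a local copy.

```shell
# Hypothetical helper: run a command up to $1 times, waiting $2 seconds
# between failed attempts. Returns 0 on the first success, 1 otherwise.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt ${retry_i}/${retry_attempts} failed: $*" >&2
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}

# URL copied verbatim from the error message in this issue.
REPORT_URL='ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/GCF_000001405.13_GRCh37/GCF_000001405.13_GRCh37_assembly_report.txt'
# Assumed local destination; any directory baked into the image would do.
REPORT_DIR="${REPORT_DIR:-/opt/ref}"

# Hypothetical wrapper: download the report into REPORT_DIR, retrying up to
# 5 times with a 10 second pause. Not invoked here; call it explicitly.
fetch_report() {
  mkdir -p "$REPORT_DIR" &&
    retry 5 10 curl --fail --silent --show-error \
      --output "$REPORT_DIR/$(basename "$REPORT_URL")" "$REPORT_URL"
}
```

Calling `fetch_report` once at image build time would put the file inside the image. Whether ichorCNA (or the package it calls) can then be told to read that local copy instead of the FTP URL is the part I don't know.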

Thanks, Ury