ENCODE-DCC / chip-seq-pipeline2

ENCODE ChIP-seq pipeline
MIT License

error installing conda environment #115

Closed gevirl closed 4 years ago

gevirl commented 4 years ago

Describe the bug: I get this error when running install_conda_env.sh:

PaddingError: Placeholder of length '80' too short in package /net/waterston/vol2/home/waterston-jboss/miniconda3/envs/encode-chip-seq-pipeline/bin/Rscript. The package must be rebuilt with conda-build > 2.0.

I have installed Miniconda, and conda --version reports: conda 4.8.0

leepc12 commented 4 years ago

This is a Conda issue, caused by some packages that were built incorrectly. It has already occurred with recent Conda 4.7.x builds. Try Conda < 4.7, e.g. 4.6.x.

gevirl commented 4 years ago

So I downgraded Conda to version 4.6.14 and got the same error.

leepc12 commented 4 years ago

There are two ways to fix this problem:

1) Install Conda under a path shorter than /net/waterston/vol2/home/waterston-jboss/, for example /home/$USER/.

2) Replace scripts/requirements.txt with the file below. But the output will be very slightly different due to different software/library versions. This will generate exactly the same output.

# for python3
# conda repos: defaults, r, bioconda, conda-forge

nomkl  # using MKL can change MACS2 output randomly on different platforms

idr ==2.0.4.2
tabix
samtools ==1.9
htslib ==1.9
sambamba ==0.6.6
samstats ==0.2.1
bedtools ==2.29.0
picard ==2.20.7

ucsc-fetchchromsizes ==357  # 377 in docker/singularity image
ucsc-wigtobigwig ==357
ucsc-bedgraphtobigwig ==357
ucsc-bigwiginfo ==357
ucsc-bedclip ==357
ucsc-bedtobigbed ==357
ucsc-twobittofa ==357
ucsc-bigWigAverageOverBed ==357

r ==3.5.1  # 3.4.4 in docker/singularity image
r-snow
r-snowfall
r-bitops
r-catools
bioconductor-rsamtools
r-spp  # 1.15 in docker/singularity image
libiconv

bwa ==0.7.17
bowtie2 ==2.3.4.3
pysam ==0.15.3
pybedtools ==0.8.0
phantompeakqualtools ==1.2.1
pybigwig ==0.3.13
openssl ==1.0.2t  # important to get the same random seed for "shuf"
deeptools ==3.3.1
cutadapt ==2.5
preseq ==2.0.3
pyfaidx ==0.5.5.2

macs2 ==2.2.4

jsondiff ==1.1.1
libgcc
requests
ncurses
gnuplot
ghostscript
wget
pyopenssl  # for caper/croo to presign bucket URIs
grep
tar

numpy
scipy
pandas
jinja2
gsl
matplotlib

java-jdk

caper
croo

See this for details.
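The path-length explanation in option 1 can be checked before installing. The following is a minimal sketch, assuming that Conda only requires the environment prefix to fit inside the placeholder embedded in a prebuilt package (80 characters, per the PaddingError above). `fits_placeholder` is a hypothetical helper, not part of Conda or the pipeline, and the shorter prefix uses an assumed user name.

```python
# Hypothetical helper for illustration; not part of Conda or the pipeline.
PLACEHOLDER_LEN = 80  # taken from the PaddingError message in this issue


def fits_placeholder(prefix: str, placeholder_len: int = PLACEHOLDER_LEN) -> bool:
    """Return True if the prefix (measured in bytes) fits in the placeholder."""
    return len(prefix.encode("utf-8")) <= placeholder_len


# Environment prefix taken from the error message in this issue.
reported = "/net/waterston/vol2/home/waterston-jboss/miniconda3/envs/encode-chip-seq-pipeline"
# Illustrative shorter prefix (assumed user name "alice").
shorter = "/home/alice/miniconda3/envs/encode-chip-seq-pipeline"

for prefix in (reported, shorter):
    print(len(prefix), fits_placeholder(prefix), prefix)
```

The reported prefix is 81 characters long, one more than the 80-character placeholder, which is consistent with the suggestion to install Conda under a shorter path.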

gevirl commented 4 years ago

I used the above requirements.txt file to build the conda environment. Since no version of r-spp is specified, bioconda installed r-spp version 1.16.0.

This version failed with error (copied from caper troubleshoot metadata.json):

chip.call_peak_ppr1 RetryableFailure. SHARD_IDX=-1, RC=1, JOB_ID=95809837, RUN_START=2019-12-25T04:32:19.101Z, RUN_END=2019-12-25T05:47:14.232Z,
STDOUT=/net/waterston/vol9/ChipSeqPipeline/C06A6.2_OP810_youngadult_1/chip/9c7bf1a0-1e2c-4119-9744-2d3789573ddf/call-call_peak_ppr1/execution/stdout,
STDERR=/net/waterston/vol9/ChipSeqPipeline/C06A6.2_OP810_youngadult_1/chip/9c7bf1a0-1e2c-4119-9744-2d3789573ddf/call-call_peak_ppr1/execution/stderr
STDERR_CONTENTS=
Traceback (most recent call last):
  File "/net/waterston/vol2/home/waterston-jboss/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spp.py", line 103, in <module>
    main()
  File "/net/waterston/vol2/home/waterston-jboss/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spp.py", line 91, in main
    args.fraglen, args.cap_num_peak, args.nth, args.out_dir)
  File "/net/waterston/vol2/home/waterston-jboss/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_task_spp.py", line 66, in spp
    run_shell_cmd(cmd0)
  File "/net/waterston/vol2/home/waterston-jboss/miniconda3/envs/encode-chip-seq-pipeline/bin/encode_lib_common.py", line 319, in run_shell_cmd
    raise Exception(err_str)
Exception: PID=12610, PGID=12610, RC=1
STDERR=Loading required package: Rcpp
Error in window.chr.call.mirror.binding(list(ctv = tvl[[chr]], bg.ctv = bg.ctv, :
  object 'tag.lwcc' not found
Calls: find.binding.positions ... window.call.mirror.binding -> lapply -> FUN -> window.chr.call.mirror.binding
Execution halted
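Unpinned entries such as r-spp let the solver choose whichever version is current at install time, so a later release can be pulled in without anyone asking for it. As a rough sketch, the unpinned entries of a requirements file can be listed as follows. The helper name `unpinned_packages` is hypothetical, and the parsing assumes the simple `name ==version  # comment` layout of the file posted earlier in this thread.

```python
from typing import List


# Hypothetical helper for illustration; not part of the pipeline.
def unpinned_packages(requirements_text: str) -> List[str]:
    """Return the package names that carry no '==' version pin."""
    names = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank and comment-only lines
        if "==" not in line:
            names.append(line.split()[0])
    return names


# A few lines copied from the requirements file posted in this thread.
sample = """
r ==3.5.1  # 3.4.4 in docker/singularity image
r-snow
bioconductor-rsamtools
r-spp  # 1.15 in docker/singularity image
phantompeakqualtools ==1.2.1
"""

print(unpinned_packages(sample))
```

Running the same function over the full scripts/requirements.txt would show every package whose version can drift between installs.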

leepc12 commented 4 years ago

The tag.lwcc error is already fixed in the latest phantompeakqualtools, 1.2.2.

$ conda activate encode-chip-seq-pipeline
$ conda install phantompeakqualtools==1.2.2 -c bioconda

leepc12 commented 4 years ago

1) 80 length Conda error: Did you change git branch to PIP-835_updates_for_v1.3.4? Please check line 24 of scripts/requirements.txt. It should be "r ==3.5.1". If not, run "git checkout PIP-835_updates_for_v1.3.4".

$ cd [PIPELINE_GIT_DIR]
$ git checkout PIP-835_updates_for_v1.3.4

2) OpenJDK 64-Bit Server VM (11.0.1-internal+0-adhoc..src): Yes, I used Java in a Conda env, but I also tried with "module load java" and got the same error.

3) I found that it fails if it runs long (~several hours). I just tested with a very small sample and it worked fine. I used the same SLURM settings but used subsampled FASTQ files.

$ sacct -j 57857703 --format=start,end
Start               End
2020-01-08T23:39:41 2020-01-08T23:52:08
2020-01-08T23:39:41 2020-01-08T23:52:08
2020-01-08T23:39:41 2020-01-08T23:52:09

I think this problem has something to do with the length of the job. Is there any limit for polling (squeue) for a session or on a node?
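For reference, the elapsed time of the small test job can be worked out from the sacct Start/End columns above. This is only an illustrative sketch. `elapsed_seconds` is a hypothetical helper, and the timestamp format is assumed from the values shown.

```python
from datetime import datetime

# Start/End pairs copied from the sacct output above.
rows = [
    ("2020-01-08T23:39:41", "2020-01-08T23:52:08"),
    ("2020-01-08T23:39:41", "2020-01-08T23:52:08"),
    ("2020-01-08T23:39:41", "2020-01-08T23:52:09"),
]


# Hypothetical helper for illustration.
def elapsed_seconds(start: str, end: str) -> int:
    """Return whole seconds between two sacct-style timestamps."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


durations = [elapsed_seconds(start, end) for start, end in rows]
for seconds in durations:
    print(f"{seconds // 60} min {seconds % 60} s")
```

Each step of the subsampled run finished in about twelve and a half minutes, well short of the several-hour runs that fail.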

4) "which gcc" returns "/usr/bin/gcc"

(encode-chip-seq-pipeline) [leepc12@sh-ln02 login /oak/stanford/groups/akundaje/leepc12/run_chip_paper/cromwell_raw_output/toy]$ which gcc
/usr/bin/gcc

5) I think amd64 is a general name for 64-bit arch.

Jin


leepc12 commented 4 years ago

Fixed.