Closed yingsun-ucsd closed 5 years ago
It looks like the estimated fragment length is negative. This pipeline is no longer maintained. Please use https://github.com/ENCODE-DCC/chip-seq-pipeline2. I think this issue is already fixed in the new pipeline.
Hi Jin,
Thanks for your reply. Unfortunately, I got lost. Would you please send me a link explaining how to run it? For example, I used to run the old pipeline as "bds /home/ysun/TF_chipseq_pipeline/chipseq.bds". Thank you so much for your help.
Best, Ying
On Oct 25, 2019, at 12:09 PM, Jin Lee notifications@github.com<mailto:notifications@github.com> wrote:
It looks like the estimated fragment length is negative. This pipeline is no longer maintained. Please use https://github.com/ENCODE-DCC/chip-seq-pipeline2. I think this issue is already fixed in the new pipeline.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/kundajelab/chipseq_pipeline/issues/48?email_source=notifications&email_token=AHEKV7ED6FLFGMGPKPD5UZ3QQM75BA5CNFSM4JE7KTU2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECJJCPA#issuecomment-546476348, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AHEKV7FK2UD3SPL7UREF6MLQQM75BANCNFSM4JE7KTUQ.
Please read through README on https://github.com/ENCODE-DCC/chip-seq-pipeline2.
Dear Jin,
I followed the link you suggested and installed everything. I am trying to run chip-seq-pipeline2 on our local Linux server. Unfortunately, I got the following error message, which I cannot figure out how to fix. I am also confused about which [wdl] I should use here. Would you please help me solve it? For your convenience, I have attached my JSON file below.
Thank you very much for your time and help. I really appreciate it a lot!
Best, Ying
testHg19.json
{
  "chip.title" : "H3K27acB1 (single-end)",
  "chip.description" : "This is a test run of B1 H3K27ac for GSE128072 for single-end sample.",
  "chip.pipeline_type" : "histone",
  "chip.aligner" : "bowtie2",
  "chip.align_only" : false,
  "chip.true_rep_only" : false,
  "chip.genome_tsv" : "/nfs/lab/ysun/chip-seq-pipeline2-genome/hg19/hg19.tsv",
  "chip.paired_end" : true,
  "chip.ctl_paired_end" : true,
  "chip.always_use_pooled_ctl" : false,
  "chip.fastqs_rep1" : [
    "SRR8701823.fastq.gz", "SRR8701824.fastq.gz", "SRR8701825.fastq.gz",
    "SRR8701826.fastq.gz", "SRR8701827.fastq.gz", "SRR8701828.fastq.gz",
    "SRR8701829.fastq.gz", "SRR8701830.fastq.gz", "SRR8701831.fastq.gz",
    "SRR8701832.fastq.gz", "SRR8701833.fastq.gz", "SRR8701834.fastq.gz"
  ],
  "chip.ctl_fastqs_rep1" : [
    "SRR8702202.fastq.gz", "SRR8702203.fastq.gz", "SRR8702204.fastq.gz",
    "SRR8702205.fastq.gz", "SRR8702206.fastq.gz", "SRR8702207.fastq.gz",
    "SRR8702208.fastq.gz"
  ]
}
It should be caper run chip-seq-pipeline2/chip.wdl -i testHg19.json
Thanks for your prompt response. I changed the command to "caper run chip.wdl -i testHg19.json". I also fixed an error in my .json file (single-end instead of paired-end). But I still got the following error message. Any suggestions will be appreciated. Thanks again.
Best, Ying
================================================================================
(encode-chip-seq-pipeline) [ysun@gatsby sub1]$ caper run chip.wdl -i testHg19.json
[CaperURI] read from local, src: /nfs/lab/ysun/GSE128072_ChIP/sub1/testHg19.json
Traceback (most recent call last):
File "/home/ysun/miniconda3/envs/encode-chip-seq-pipeline/bin/caper", line 13, in
Check line 13 column 29 of your input JSON. Check commas, brackets, ... For example, TAB is not allowed in JSON.
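A quick way to locate this kind of problem before launching the pipeline is to parse the input file with Python's standard json module, which reports the line and column of the first syntax error. The sketch below is only an illustration: the helper name check_json_text and the sample strings are made up for this example, and the second sample contains a literal TAB inside a string value, which the parser rejects.

```python
import json


def check_json_text(text):
    """Return None if text is valid JSON, else a message with line and column."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions of the first error found
        return f"line {err.lineno} column {err.colno}: {err.msg}"
    return None


# Hypothetical samples: the second has a raw TAB character inside a string.
good = '{"chip.paired_end": false}'
bad = '{"chip.title": "H3K27ac\tB1"}'

print(check_json_text(good))
print(check_json_text(bad))
```

To check a real input file, read it first, for example check_json_text(open("testHg19.json").read()).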
Thanks. I caught the typo, and now it is running as follows. Does it look normal to you? Thanks.
Best, Ying
================================
[ysun@gatsby sub1]$ conda activate encode-chip-seq-pipeline
(encode-chip-seq-pipeline) [ysun@gatsby sub1]$ caper run chip.wdl -i testHg19.json
[CaperURI] read from local, src: /nfs/lab/ysun/GSE128072_ChIP/sub1/testHg19.json
[CaperURI] read from local, src: /nfs/lab/ysun/chip-seq-pipeline2-genome/hg19/hg19.tsv
[CaperURI] copying from url to local, src: https://github.com/broadinstitute/cromwell/releases/download/47/cromwell-47.jar
[CaperURI] wait 30 sec for file being unlocked. retries: 1, max_retries: 100. uri: /home/ysun/.caper/cromwell_jar/cromwell-47.jar
[CaperURI] wait 30 sec for file being unlocked. retries: 2, max_retries: 100. uri: /home/ysun/.caper/cromwell_jar/cromwell-47.jar
[CaperURI] wait 30 sec for file being unlocked. retries: 3, max_retries: 100. uri: /home/ysun/.caper/cromwell_jar/cromwell-47.jar
[CaperURI] wait 30 sec for file being unlocked. retries: 4, max_retries: 100. uri: /home/ysun/.caper/cromwell_jar/cromwell-47.jar
[CaperURI] wait 30 sec for file being unlocked. retries: 5, max_retries: 100. uri: /home/ysun/.caper/cromwell_jar/cromwell-47.jar
Delete lock file /home/ysun/.caper/cromwell_jar/*.lock and try again. Make sure that you have only one process to run a pipeline.
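If you prefer to do the cleanup from Python rather than by hand, something like the following would work. It is a minimal sketch: remove_lock_files is a hypothetical helper written for this example, not part of Caper, and it should only be run when no other pipeline process is using the directory.

```python
from pathlib import Path


def remove_lock_files(directory):
    """Delete every *.lock file directly inside directory; return the removed paths."""
    removed = []
    for lock in sorted(Path(directory).expanduser().glob("*.lock")):
        lock.unlink()  # remove the stale lock file
        removed.append(lock)
    return removed
```

For the case in this thread the call would be remove_lock_files("~/.caper/cromwell_jar"), followed by rerunning the caper command.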
It’s running. Thanks!
Hi Jin,
I am trying to use Croo to organize the chip-seq-pipeline2 output as follows, but I got an error message too. Please advise what I should do to fix it. Thank you so much and have a nice weekend!
Best, Ying
===================
(encode-chip-seq-pipeline) [ysun@gatsby GSE128072_ChIP]$ croo /nfs/lab/ysun/GSE128072_ChIP/chip/d531476f-00a3-488f-8589-5d2653b385d4/call-qc_report/execution/qc.json --out-def-json summaryTestHg19.json --out-dir testBucket
[qc.json is the only JSON file I can find in the output folder. Is this the right one to use?]
Traceback (most recent call last):
File "/home/ysun/miniconda3/envs/encode-chip-seq-pipeline/bin/croo", line 13, in
Find metadata.json instead of qc.json.
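If the file is hard to spot among the many call-* subfolders, a recursive search can list the candidates. This is a small sketch that assumes metadata.json sits somewhere under the run's output directory; find_metadata_json is a hypothetical helper name, not a Croo or Caper function.

```python
from pathlib import Path


def find_metadata_json(output_root):
    """Return all metadata.json files found under output_root, sorted by path."""
    return sorted(Path(output_root).rglob("metadata.json"))
```

For example, find_metadata_json("/nfs/lab/ysun/GSE128072_ChIP/chip") would search every workflow folder under chip/, and one of the returned paths can then be passed to croo in place of qc.json.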
Please advise. Thank you so much.
Best, Ying
Another question: the output has been stored in a default folder like "./chip/e496c521-6b96-47a1-b47c-cb8ab1416730/" (or a similar long string). Can I give it a more meaningful name instead of the long one?
Best, Ying
I went through the output folder but cannot find ".pval.signal.bigwig" in "/chip/e496c521-6b96-47a1-b47c-cb8ab1416730//call-macs2_signal_track/shard-0/execution/"; there is only ".merged.nodup_x_ctl_for_rep1.fc.signal.bigwig". Would you please let me know what I did wrong here? Thanks a lot!
Best, Ying
Use croo to organize Caper's raw outputs.
My Croo does not work properly for some reason.
Can you upgrade croo to 0.3.3 and try again?
pip install croo==0.3.3
Sorry, but still no luck.
Best, Ying
Can you find/modify your cromwell_metadata.py like the following?
How to find it:
leepc12@kadru:/users/leepc12/code/croo$ python
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import croo
>>> import croo.cromwell_metadata
>>> croo.cromwell_metadata.__file__
'/users/leepc12/code/croo/croo/cromwell_metadata.py'
>>>
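The same lookup can be done non-interactively with importlib from the standard library. This is just a sketch: module_file is a hypothetical helper, and the example call uses the standard json module so that it runs even where croo is not installed.

```python
import importlib.util


def module_file(name):
    """Return the file path of an importable module, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # raised for dotted names whose parent package is missing
        return None
    return spec.origin if spec else None


print(module_file("json"))
```

In the conda environment used for the pipeline, module_file("croo.cromwell_metadata") should give the path of the file to edit.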
Replace this (from line 183)
for output_name, output_path, _ in out_files:
    # add each output file to DAG
    n = CMNode(
        type='output',
        shard_idx=shard_idx,
        task_name=task_name,
        output_name=output_name,
        output_path=output_path,
        all_outputs=None,
        all_inputs=None)
    self._dag.add_node(n)
with this
if out_files:
    for output_name, output_path, _ in out_files:
        # add each output file to DAG
        n = CMNode(
            type='output',
            shard_idx=shard_idx,
            task_name=task_name,
            output_name=output_name,
            output_path=output_path,
            all_outputs=None,
            all_inputs=None)
        self._dag.add_node(n)
If this works, I will make a hot fix.
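The effect of the added check can be seen in isolation. The sketch below assumes, since the full traceback is not shown in this thread, that out_files can be None for a task with no file outputs; output_names is a hypothetical stand-in for the loop above and does not use Croo's CMNode.

```python
def output_names(out_files):
    """Collect output names, treating a missing (None) file list as empty."""
    names = []
    if out_files:  # skips both None and an empty list
        for output_name, output_path, _ in out_files:
            names.append(output_name)
    return names


# Without the guard, iterating over None would raise a TypeError.
print(output_names(None))
print(output_names([("qc_json", "/path/to/qc.json", None)]))
```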
After the replacement, it works perfectly fine. Thank you so much for your help.
Best, Ying
That sounds good.
Hi Jin,
I have one more question. I am wondering if you can help me to figure it out.
It seems that I did not activate the conda environment correctly. I am not very familiar with conda; do you have any idea what I should do to fix this? I am sorry to bother you so often, and I really appreciate all your help.
Best regards, Ying
It's Conda's issue. Conda has changed its activation method from source activate to conda activate. source activate worked fine in a BASH script but conda activate doesn't.
Got it! Thanks!
Best, Ying
I am running chipseq with 9 replicates and 11 controls as follows:
bds chipseq.bds -final_stage peak -no_pseudo_rep -peak_caller macs2 -species hg19 -nth 12 -out_dir test_hg19 -title B10 -fastq1 SRR8701918.fastq.gz -fastq2 SRR8701919.fastq.gz -fastq3 SRR8701920.fastq.gz -fastq4 SRR8701921.fastq.gz -fastq5 SRR8701922.fastq.gz -fastq6 SRR8701923.fastq.gz -fastq7 SRR8701925.fastq.gz -fastq8 SRR8701926.fastq.gz -fastq9 SRR8701924.fastq.gz -ctl_fastq1 SRR8702298.fastq.gz -ctl_fastq2 SRR8702299.fastq.gz -ctl_fastq3 SRR8702300.fastq.gz -ctl_fastq4 SRR8702301.fastq.gz -ctl_fastq5 SRR8702302.fastq.gz -ctl_fastq6 SRR8702303.fastq.gz -ctl_fastq7 SRR8702304.fastq.gz -ctl_fastq8 SRR8702305.fastq.gz -ctl_fastq9 SRR8702306.fastq.gz -ctl_fastq10 SRR8702307.fastq.gz -ctl_fastq11 SRR8702308.fastq.gz
1) The alignment seemed fine. But in the "align" folder, I can see only 2 controls: "ctl1 ctl2 pooled_ctl pooled_pseudo_reps pooled_rep pseudo_reps rep1 rep2 rep3 rep4 rep5 rep6 rep7 rep8 rep9". Were the other controls pooled together? Or is 2 the maximum number of controls chipseq.bds can take?
2) The macs2 peak calling seems to have failed, and I got many error messages saying "ERROR:root:--extsize must >= 1!"
Any suggestion will be appreciated. Thanks.