Closed annashcherbina closed 4 years ago
Upgrading to 0.3.3 does not alter the error.
(encode-atac-seq-pipeline) annashch@caper:~$ pip show croo
Name: croo
Version: 0.3.3
Summary: CRomwell Output Organizer
Home-page: https://github.com/ENCODE-DCC/croo
Author: Jin Lee
Author-email: leepc12@gmail.com
License: UNKNOWN
Location: /opt/anaconda3/envs/encode-atac-seq-pipeline/lib/python3.7/site-packages
Requires: graphviz, caper
Required-by:
(encode-atac-seq-pipeline) annashch@caper:~$ croo --out-dir /data/croo --out-def atac.croo.json gs://caper_out/bpnet_anna/atac/d47e2acb-cd7b-4ce4-b342-b9bbbe3e54fe/metadata.json
[CaperURI] copying from gcs to local, src: gs://caper_out/bpnet_anna/atac/d47e2acb-cd7b-4ce4-b342-b9bbbe3e54fe/metadata.json
[CaperURI] copying skipped, target: /data/croo/.croo_tmp/caper_out/bpnet_anna/atac/d47e2acb-cd7b-4ce4-b342-b9bbbe3e54fe/metadata.json
Traceback (most recent call last):
  File "/opt/anaconda3/envs/encode-atac-seq-pipeline/bin/croo", line 13, in <module>
    main()
  File "/opt/anaconda3/envs/encode-atac-seq-pipeline/lib/python3.7/site-packages/croo/croo.py", line 304, in main
    no_graph=args['no_graph'])
  File "/opt/anaconda3/envs/encode-atac-seq-pipeline/lib/python3.7/site-packages/croo/croo.py", line 53, in __init__
    self._cm = CromwellMetadata(self._metadata)
  File "/opt/anaconda3/envs/encode-atac-seq-pipeline/lib/python3.7/site-packages/croo/cromwell_metadata.py", line 100, in __init__
    self.__parse_calls(self._metadata_json['calls'])
  File "/opt/anaconda3/envs/encode-atac-seq-pipeline/lib/python3.7/site-packages/croo/cromwell_metadata.py", line 183, in __parse_calls
    for output_name, output_path, _ in out_files:
TypeError: 'NoneType' object is not iterable
You need to use the latest output def JSON file (--out-def-json). Use atac.croo.v2.json in the ATAC-seq pipeline's git directory.
I'm getting the same problem with the ChIP-seq pipeline. Is this also because of the chip.croo.json file?
Attached is my metadata json file. Could it be a problem with this file?
@leepc12 I'm using the v2 json file, and the error does not change.
@ychsiao1: Did you try with the latest Croo? Check its version with croo -v.
@leepc12 My croo version is 0.3.3
Pipeline's Conda environment has its own Croo. You may have two Croos (one in Conda env and the other pip-installed).
Activate pipeline's Conda env and check Croo's version in it.
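One way to see whether more than one croo executable is visible is to walk the PATH and list every match, similar to `which -a croo`. The sketch below is only an illustration: the helper name `find_all_executables` is hypothetical and not part of Croo or Caper, and it assumes a POSIX-style PATH.

```python
import os


def find_all_executables(name, search_path=None):
    """Return every executable called `name` found on the search path, in order."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    matches = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        # keep regular files that the current user may execute
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            if candidate not in matches:
                matches.append(candidate)
    return matches


# The first entry is the one the shell would run; extra entries hint at
# a second installation (for example one from Conda and one from pip).
for path in find_all_executables("croo"):
    print(path)
```

Running this before and after `conda activate` should show whether the active environment changes which croo comes first.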
After activating conda env (conda activate encode-chip-seq-pipeline), croo -v is still 0.3.3
@annashcherbina Can you edit lines 183-193 of /opt/anaconda3/envs/encode-atac-seq-pipeline/lib/python3.7/site-packages/croo/cromwell_metadata.py like the following (just adding an if statement, if out_files:)?
    if out_files:
        for output_name, output_path, _ in out_files:
            # add each output file to DAG
            n = CMNode(
                type='output',
                shard_idx=shard_idx,
                task_name=task_name,
                output_name=output_name,
                output_path=output_path,
                all_outputs=None,
                all_inputs=None)
            self._dag.add_node(n)
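For anyone reading along, the effect of the guard can be reproduced outside Croo. This is a minimal standalone sketch, not Croo's actual code: `collect_outputs` is a hypothetical helper, and the three-element tuples only mimic the shape unpacked in the loop above.

```python
def collect_outputs(out_files):
    """Return (name, path) pairs, treating None as 'no file outputs'."""
    collected = []
    # Without this guard, looping over None raises
    # TypeError: 'NoneType' object is not iterable
    if out_files:
        for output_name, output_path, _ in out_files:
            collected.append((output_name, output_path))
    return collected


# A task with no recorded file outputs is skipped instead of crashing.
print(collect_outputs(None))
print(collect_outputs([("bam", "/data/example.bam", None)]))
```

The guard also skips an empty list, which is harmless because there would be nothing to add in that case anyway.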
@leepc12 -- this edit gets rid of the error for both versions, with both the v1 and v2 JSON files. (Just switching to v2 of the JSON didn't work for me either, so this edit was needed.) Croo completed without errors once the if statement checking whether out_files was defined had been added.
Can you clarify why "out_files" would be None in some cases and not others? I ran 6 samples in an identical fashion, and only this one had this issue (all pipeline/croo/input json versions same for the 6 samples). Does this indicate an issue with the pipeline outputs?
Thanks for your help!
That can happen when there is a task without an actual File output. There is no such task in the ENCODE pipelines, but it can also happen if metadata.json was not updated. This is a known issue with Caper: it sometimes fails to update metadata.json in the output directory cromwell-executions/. In that case some tasks (and the whole workflow) are still marked as Running, and there are of course no File outputs for such tasks.
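A quick way to spot a stale metadata.json is to list the calls that are not marked as Done. The sketch below assumes the usual Cromwell metadata layout, where `calls` maps each task name to a list of attempts carrying `executionStatus` and `shardIndex`; the helper `unfinished_calls` and the toy task names are made up for illustration.

```python
def unfinished_calls(metadata):
    """List (task_name, shard_index, status) for calls not marked Done."""
    pending = []
    for task_name, attempts in metadata.get("calls", {}).items():
        for call in attempts:
            status = call.get("executionStatus")
            if status != "Done":
                pending.append((task_name, call.get("shardIndex", -1), status))
    return pending


# Toy metadata standing in for json.load(open("metadata.json")).
example = {
    "status": "Running",
    "calls": {
        "atac.align": [{"executionStatus": "Done", "shardIndex": 0}],
        "atac.qc_report": [{"executionStatus": "Running", "shardIndex": -1}],
    },
}
for task_name, shard_index, status in unfinished_calls(example):
    print(task_name, shard_index, status)
```

If this prints anything for a workflow that is known to have finished, the metadata.json was probably written before the final update.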
I will make a new release today.
The pipeline runs successfully now. Thanks!
As a side note, are the two example jsons written to skip peak calling? I don't see peak calling outputs when I run the examples. If I wanted to do peak calling, I would have to add that task into the json files?
@ychsiao1 What is your pipeline version?
@leepc12 I git cloned the ENCODE pipeline yesterday, so it should be the newest version (encode-chip-seq-pipeline2). As for Caper and Croo, I'm using versions 0.6.3 and 0.3.4 respectively.
@ychsiao1: Can you upload your metadata.json and Croo's HTML report?
@leepc12 Here are the two files
@ychsiao1: This metadata.json is from a failed workflow. The workflow's status is marked as "Failed". I think it failed before the peak-calling steps. Try with a metadata.json from a succeeded workflow.
@leepc12 Is there a particular reason why the workflow failed when creating that metadata.json file? I'm using the example inputs provided within the cloned directory without making any changes. (ENCSR936XTK_subsampled_chr19_only.json)
metadata.json is like an output log of a workflow. It seems like your test run with ENCSR936XTK_subsampled_chr19_only.json failed somehow.
Please post an issue on the pipeline's GitHub repo and I will take a look. Follow the bug reporting instructions there.
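Before posting, it may help to pull the workflow status and any recorded failure messages out of metadata.json. The following is a rough sketch that assumes the top-level `status` and `failures` keys (with nested `causedBy` lists) found in Cromwell metadata; `summarize_workflow` and the sample messages are hypothetical.

```python
def summarize_workflow(metadata):
    """Return (status, messages) collected from nested failure entries."""
    status = metadata.get("status", "Unknown")
    messages = []
    stack = list(metadata.get("failures", []))
    while stack:
        failure = stack.pop()
        if failure.get("message"):
            messages.append(failure["message"])
        # causedBy holds further failure entries of the same shape
        stack.extend(failure.get("causedBy", []))
    return status, messages


# Toy metadata standing in for json.load(open("metadata.json")).
example = {
    "status": "Failed",
    "failures": [
        {
            "message": "Workflow failed",
            "causedBy": [{"message": "Job exited with a non-zero code", "causedBy": []}],
        }
    ],
}
status, messages = summarize_workflow(example)
print(status)
for message in messages:
    print(" -", message)
```

A status other than Succeeded means Croo will only see the outputs produced before the workflow stopped.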