databio / pepatac

A modular, containerized pipeline for ATAC-seq data processing
http://pepatac.databio.org
BSD 2-Clause "Simplified" License

The samples cannot be separated in the result file #271

zhongzheng1999 closed 8 months ago

zhongzheng1999 commented 8 months ago

Hi, I am trying to run looper runp --looper-config examples/test_project/looper_test_refgenie.yaml, but I get an error:

"FileNotFoundError: [Errno 2] No such file or directory: '/home/u2204084007/biosoft/pepatac-0.11.2/processed/results_pipeline/test1/stats.yaml'."

I set up the looper config based on the examples you provided earlier, and I have attached the corresponding files. looper_test_refgenie.yaml:

name: test_project
pep_config: /home/u2204084007/biosoft/pepatac-0.11.2/examples/test_project/test_config_project_config.yaml

output_dir: "/home/u2204084007/biosoft/pepatac-0.11.2/processed/"
pipeline_interfaces:
  sample: ["/home/u2204084007/biosoft/pepatac-0.11.2/sample_pipeline_interface.yaml"]
  project: ["/home/u2204084007/biosoft/pepatac-0.11.2/project_pipeline_interface.yaml"]

pipestat:
  results_file_path: "/home/u2204084007/biosoft/pepatac-0.11.2/processed/results_pipeline/{record_identifier}/stats.yaml"

test_config_project_config.yaml:

name: test_project
pep_version: 2.0.0
sample_table: test_annotation.csv
sample_modifiers:
  derive:
    attributes: [read1, read2]
    sources:
      test_data_R1: "/home/u2204084007/biosoft/pepatac-0.11.2/examples/data/{sample_name}_r1.fastq.gz"
      test_data_R2: "/home/u2204084007/biosoft/pepatac-0.11.2/examples/data/{sample_name}_r2.fastq.gz"
  imply:
    - if:
        organism: ["human", "Homo sapiens", "Human", "Homo_sapiens"]
      then:
        genome: hg38
        prealignment_names: ["rCRSd"]
        deduplicator: samblaster
        trimmer: skewer
        peak_type: fixed
        extend: "250"
        frip_ref_peaks: None

It's worth noting that in the results folder, only the "summary" folder was generated, and individual folders for each sample were not created as expected.

test_annotation.csv PEPATAC_log.txt

donaldcampbelljr commented 8 months ago

Hi,

You must first use looper run to generate the stats.yaml results files for the individual samples before running the project level pipelines using looper runp.

So in this case:

looper run --looper-config examples/test_project/looper_test_refgenie.yaml

then

looper runp --looper-config examples/test_project/looper_test_refgenie.yaml

Finally, you can make an HTML report with both the sample-level and the project-level results using:

looper report --looper-config examples/test_project/looper_test_refgenie.yaml
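If it helps, the FileNotFoundError above can be caught before calling looper runp by checking that each sample's stats.yaml already exists. The following is only a rough sketch: missing_stats_files is a hypothetical helper, not part of looper or pipestat, and it assumes the {record_identifier} placeholder in results_file_path is filled with the sample name, as the path in the error message suggests.

```python
from pathlib import Path
from typing import Iterable, List


def missing_stats_files(template: str, sample_names: Iterable[str]) -> List[str]:
    """Return the expanded results paths that do not exist on disk yet.

    Hypothetical helper for illustration; assumes {record_identifier}
    is replaced by the sample name.
    """
    missing = []
    for name in sample_names:
        # Fill in the placeholder used by results_file_path in the looper config.
        path = Path(template.format(record_identifier=name))
        if not path.is_file():
            missing.append(str(path))
    return missing


if __name__ == "__main__":
    template = (
        "/home/u2204084007/biosoft/pepatac-0.11.2/processed/"
        "results_pipeline/{record_identifier}/stats.yaml"
    )
    # "test1" is the only sample name visible in the error message;
    # any other samples from test_annotation.csv would need to be listed too.
    for path in missing_stats_files(template, ["test1"]):
        print(f"missing: {path} (run `looper run` first)")
```

An empty result means every sample-level results file is in place, so the project-level step should at least get past this particular error.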

zhongzheng1999 commented 8 months ago

Later on, I encountered a new error, this time from ggplot in R, complaining about the absence of the 'linewidth' parameter. After upgrading the ggplot2 version, I was able to fix this issue. I have now obtained the complete run results. Thanks again for your assistance.