databio / pepatac

A modular, containerized pipeline for ATAC-seq data processing
http://pepatac.databio.org
BSD 2-Clause "Simplified" License

HMMRATAC as peak caller - Peaks folder output #180

Closed PratimaNL closed 3 years ago

PratimaNL commented 3 years ago

Hello,

We're running PEPATAC using HMMRATAC as peak caller.

Is it expected to see "HMMRATAC failed to identify any peaks." in the log file, along with an empty .narrowPeak file, because HMMRATAC generates the .gappedPeak and summits.bed files instead? I've also attached a screenshot of the peaks folder for one of the samples. Are these the expected outputs?

[Attached screenshot of the peaks folder: Screen Shot 2021-05-17 at 10 49 25 AM]

Thank you for your help, Pratima

jpsmith5 commented 3 years ago

Hi @pratimanl, yes. I recently uncovered some bugs introduced by changes to the pipeline, and they are now fixed in the development branch. You can check that branch out, or wait a day: I'm close to releasing those updates, which should address this for you.

PratimaNL commented 3 years ago

Thank you @jpsmith5. Updating our pepatac version is still a pending task, so for now we are continuing with pepatac version 0.9.10. We would like to find out whether the following is really a Java memory issue or something we need to troubleshoot within the pepatac code.

Snippet from the log file of one of the samples that generated fewer output files than expected:

Target to produce: ..._peaks.narrowPeak

java -jar .../software/hmmratac/1.2.10/HMMRATAC.jar --bam ..._fixed_header.bam --index ..._fixed_header.bam.bai --genome .../peak_calling_Sscrofa11_1/chr_order.txt --output .../peak_calling_Sscrofa11_1/P452-2_peaks.narrowPeak (99376)

     

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.Arrays.copyOf(Arrays.java:3688)
    at java.base/java.util.ArrayList.grow(ArrayList.java:237)
    at java.base/java.util.ArrayList.grow(ArrayList.java:242)
    at java.base/java.util.ArrayList.add(ArrayList.java:467)
    at java.base/java.util.ArrayList.add(ArrayList.java:480)
    at FormatConverters.PileupToBedGraph.toMap(PileupToBedGraph.java:114)
    at FormatConverters.PileupToBedGraph.run(PileupToBedGraph.java:50)
    at FormatConverters.PileupToBedGraph.<init>(PileupToBedGraph.java:41)
    at WigMath.pileup.toBedGraph(pileup.java:217)
    at WigMath.pileup.build(pileup.java:206)
    at WigMath.pileup.<init>(pileup.java:71)
    at HMMR_ATAC.Main_HMMR_Driver.main(Main_HMMR_Driver.java:270)

We're currently trying to change the java call in the pepatac.py file to something like "java -Xms1g -Xmx7g -jar ...".
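For illustration, a minimal sketch of that kind of change, assuming the java call is available as a list of tokens. This is not the actual pepatac.py code: `add_heap_flags` is a hypothetical helper and the file names are placeholders.

```python
def add_heap_flags(cmd, xms="1g", xmx="7g"):
    """Return a copy of a java command with heap flags placed right after 'java'.

    Hypothetical helper for illustration; pepatac.py may build its
    commands differently.
    """
    if not cmd or cmd[0] != "java":
        raise ValueError("expected a command whose first token is 'java'")
    # -Xms sets the initial heap size, -Xmx the maximum heap size.
    return [cmd[0], f"-Xms{xms}", f"-Xmx{xmx}"] + list(cmd[1:])


# Placeholder arguments, not the real paths from the log.
example_cmd = ["java", "-jar", "HMMRATAC.jar", "--bam", "sample.bam"]
heap_cmd = add_heap_flags(example_cmd)
print(" ".join(heap_cmd))
```

The heap flags have to come before `-jar`; anything after the jar path is passed to the program rather than to the JVM.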

Do you have any input on how to go about this? Or is this part of the list of fixes you've already made?

Thank you for your help

jpsmith5 commented 3 years ago

Hey @pratimanl, I've just released a version that should ease this process. You can specify additional Java parameters in the pepatac.yaml configuration file under the java_settings: header. Any step that makes a java call will append these settings to the call itself.

Similarly, there's also the built-in JAVA_TOOL_OPTIONS environment variable, which you could set to modify your Java settings system-wide.
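As a rough sketch of the environment-variable route, the snippet below builds an environment with JAVA_TOOL_OPTIONS set, which could then be passed to a child process. `env_with_java_options` is a hypothetical helper, and the heap values are just the ones mentioned earlier in this thread.

```python
import os


def env_with_java_options(options, base_env=None):
    """Return a copy of the environment with `options` added to JAVA_TOOL_OPTIONS.

    Hypothetical helper for illustration. Any existing value of the
    variable is kept and the new options are appended after it.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("JAVA_TOOL_OPTIONS", "")
    env["JAVA_TOOL_OPTIONS"] = f"{existing} {options}".strip()
    return env


# Small explicit base environment so the example is reproducible.
env = env_with_java_options("-Xms1g -Xmx7g", base_env={"PATH": "/usr/bin"})
print(env["JAVA_TOOL_OPTIONS"])

# The result could be handed to a child process, for example:
#   subprocess.run(java_command, env=env)
# where java_command is whatever java call the pipeline issues.
```

Exporting the variable in the shell before launching the pipeline should have the same effect, since the JVM reads it at startup.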