broadinstitute / ABC-Enhancer-Gene-Prediction

Cell type specific enhancer-gene predictions using ABC model (Fulco, Nasser et al, Nature Genetics 2019)
MIT License

How to get .KRobserved files from .hic file type? #53

Closed atmaivancevic closed 11 months ago

atmaivancevic commented 3 years ago

Hi,

I'm having issues whenever I try to use a .hic file to compute the powerlaw fit. From the GitHub page, I thought I could feed in just a .hic file, but compute_powerlaw_fit_from_hic.py seems to expect .KRobserved files too. I've attached some example input and errors below.

First, I tried downloading public Hi-C data and fitting a powerlaw distribution, e.g.:

cd HiC_from_HCT116-RAD21/
wget https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/4a1b58d3-9ae6-43e4-91cf-49f1fcbbab33/4DNFIYWONU7A.hic
cd ..

python ABC-Enhancer-Gene-Prediction/src/compute_powerlaw_fit_from_hic.py \
--hicDir HiC_from_HCT116-RAD21/ \
--outDir HiC_from_HCT116-RAD21/powerlaw/ \
--maxWindow 1000000 \
--minWindow 5000 \
--resolution 5000

But it failed with the following error: [Errno 2] No such file or directory: 'HiC_from_HCT116-RAD21/chr1/chr1.KRobserved.gz'
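
The path in that error suggests the script looks for one gzipped file per chromosome, laid out as <hicDir>/chrN/chrN.KRobserved.gz, rather than reading the .hic file directly. A minimal sketch for checking which of those files are present before running the fit; the helper name `missing_kr_files` and the chromosome list are illustrative assumptions, not part of the ABC codebase:

```python
from pathlib import Path


def missing_kr_files(hic_dir, chromosomes):
    """Return the expected per-chromosome KRobserved paths that do not exist.

    Assumes the layout implied by the error message:
    <hic_dir>/<chrom>/<chrom>.KRobserved.gz
    """
    hic_dir = Path(hic_dir)
    expected = [hic_dir / c / f"{c}.KRobserved.gz" for c in chromosomes]
    return [p for p in expected if not p.is_file()]


if __name__ == "__main__":
    # Hypothetical chromosome list; adjust to the genome in use.
    chroms = [f"chr{i}" for i in range(1, 23)] + ["chrX"]
    for path in missing_kr_files("HiC_from_HCT116-RAD21", chroms):
        print("missing:", path)
```

If every chromosome is reported as missing, the dump step has not been run yet (or wrote to a different directory).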

I get a similar error when I try to download the file suggested on GitHub, i.e.:

python ABC-Enhancer-Gene-Prediction/src/juicebox_dump.py \
--hic_file https://hicfiles.s3.amazonaws.com/hiseq/k562/in-situ/combined_30.hic \
--juicebox "java -jar juicer_tools.jar" \
--outdir testHic/ \
--chromosomes 1

gives this error:

Starting chr1 ... 
java -jar juicer_tools.jar dump observed KR https://hicfiles.s3.amazonaws.com/hiseq/k562/in-situ/combined_30.hic 1 1 BP 5000 testHic//chr1//chr1.KRobserved
Running command: gzip testHic//chr1//chr1.KRobserved
gzip: testHic//chr1//chr1.KRobserved: No such file or directory
subprocess.CalledProcessError: Command 'gzip testHic//chr1//chr1.KRobserved' returned non-zero exit status 1.
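
Reading the log, gzip fails only because the preceding dump step never wrote chr1.KRobserved, so the underlying failure is probably in the java call. A sketch that rebuilds the logged command and checks the dump result before compressing; `build_dump_command` and `dump_chromosome` are hypothetical helpers, not the code in juicebox_dump.py, and the argument order is copied from the log line above:

```python
import gzip
import shlex
import shutil
import subprocess
from pathlib import Path


def build_dump_command(juicebox, hic_file, chrom, resolution, out_file):
    """Rebuild the command shown in the log:
    <juicebox> dump observed KR <hic> <chrom> <chrom> BP <resolution> <out>
    """
    return shlex.split(juicebox) + [
        "dump", "observed", "KR", str(hic_file),
        str(chrom), str(chrom), "BP", str(resolution), str(out_file),
    ]


def dump_chromosome(juicebox, hic_file, chrom, resolution, outdir):
    """Dump one chromosome and gzip the result, failing early with the java error."""
    out_dir = Path(outdir) / f"chr{chrom}"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"chr{chrom}.KRobserved"

    cmd = build_dump_command(juicebox, hic_file, chrom, resolution, out_file)
    # Raises FileNotFoundError if the executable (e.g. java) is not on PATH.
    result = subprocess.run(cmd, capture_output=True, text=True)

    # Surface the dump error here instead of letting the gzip step fail later.
    if result.returncode != 0 or not out_file.is_file():
        raise RuntimeError(
            f"dump failed for chr{chrom} (exit {result.returncode}):\n{result.stderr}"
        )

    # Compress in Python so a missing file can never reach gzip.
    gz_file = out_file.parent / (out_file.name + ".gz")
    with open(out_file, "rb") as src, gzip.open(gz_file, "wb") as dst:
        shutil.copyfileobj(src, dst)
    out_file.unlink()
    return gz_file
```

With this, a missing juicer_tools.jar shows up as the java error message rather than as "No such file or directory" from gzip.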

Am I missing something? Do we need to generate the *.KRobserved files ourselves?

Any help would be great, many thanks in advance!

lindaboshans commented 3 years ago

You can generate the *.KRobserved files by using juicebox_dump.py in the src folder:

python juicebox_dump_VC.py --hic_file path_to_hic/.hic --juicebox "java -jar juicer_tools_1.22.01.jar" --outdir --chromosomes $chrom

jialuqian commented 3 years ago

python juicebox_dump_VC.py --hic_file path_to_hic/.hic --juicebox "java -jar juicer_tools_1.22.01.jar" --outdir --chromosomes $chrom

I'm sorry, I ran into the same problem: even after running this command, it still says the file does not exist.

[Screenshot: "Capture"]

Any help would be great, many thanks in advance!

bhoellbacher commented 3 years ago

Make sure you have juicer_tools.jar downloaded and adjust the path to the file location if necessary.
Try e.g.


python juicebox_dump.py \
      --hic_file path_to_hic/.hic \
      --juicebox "java -jar path_to_jar/juicer_tools.jar" \
      --outdir my_outdir \
      --chromosomes 22
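
A quick way to check the first point before running the dump, as a rough sketch: `check_juicebox_command` is a hypothetical helper (not part of the repository) that assumes the --juicebox string has the form "java -jar <path to jar>", as in the command above.

```python
import shlex
import shutil
from pathlib import Path


def check_juicebox_command(juicebox):
    """Return a list of problems found in a --juicebox string such as
    'java -jar path_to_jar/juicer_tools.jar'. An empty list means no
    problem was detected by these checks."""
    parts = shlex.split(juicebox)
    if not parts:
        return ["empty --juicebox command"]

    problems = []
    # The first token must be an executable that can be found (e.g. java).
    if shutil.which(parts[0]) is None:
        problems.append(f"executable not found on PATH: {parts[0]}")

    # The token after -jar must point at an existing file.
    if "-jar" in parts:
        idx = parts.index("-jar")
        if idx + 1 >= len(parts):
            problems.append("-jar given without a path")
        elif not Path(parts[idx + 1]).is_file():
            problems.append(f"jar file not found: {parts[idx + 1]}")
    return problems


if __name__ == "__main__":
    for problem in check_juicebox_command("java -jar path_to_jar/juicer_tools.jar"):
        print(problem)
```

Relative jar paths are resolved against the current working directory, so run the check from the same directory as the dump command.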

atancoder commented 11 months ago

We've revamped the codebase. Please check out https://github.com/broadinstitute/ABC-Enhancer-Gene-Prediction/tree/main and reopen your issue if the problem persists.