nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

Running Hi-C pro yields no results #400

Closed yamzaleg closed 3 years ago

yamzaleg commented 3 years ago

Hello,

I'm trying to perform a HiChIP analysis, which combines principles of ChIP and Hi-C to determine interactions between genomic loci. I'm running HiC-Pro on published data to make sure I know how to perform the analysis before proceeding to my own. I've made sure to have the fastq files in a particular directory (each individual sample has its own directory within the main fastq directory). I've also filled out config-hicpro.txt. When I run the command, I see that my output directory now contains a symbolic link to my raw data (called rawdata) and a new configuration file (with the same name). Nothing seems to be running and I'm stuck. Can you help me out?

the code I'm running: /home1/amzaleg/new/amzaleg/hichip_tools/HiC-Pro-2.11.4/bin/HiC-Pro -i ~/new/amzaleg/hichip/raw_data -o ~/new/amzaleg/hichip/run/new_output -c ~/new/amzaleg/hichip/run/config-hicpro.txt -s mapping -s proc_hic

I'll also attach my configuration file config-hicpro.txt

nservant commented 3 years ago

Hi, everything looks good. Could you show me the contents of your raw_data folder, please? N

yamzaleg commented 3 years ago

Yes, of course.

amzaleg@discovery2:~/new/amzaleg/hichip$ tree raw_data
raw_data
├── GM_rep1
│   ├── GM_HiChIP_H3K27Ac_rep1_1.fastq.gz
│   └── GM_HiChIP_H3K27Ac_rep1_2.fastq.gz
├── GM_rep2
│   ├── GM_HiChIP_H3K27Ac_rep2_1.fastq.gz
│   └── GM_HiChIP_H3K27Ac_rep2_2.fastq.gz
├── MyLa_rep1
│   ├── MyLa_HiChIP_H3K27Ac_rep1_1.fastq.gz
│   └── MyLa_HiChIP_H3K27Ac_rep1_2.fastq.gz
└── MyLa_rep2
    ├── MyLa_HiChIP_H3K27Ac_rep2_1.fastq.gz
    └── MyLa_HiChIP_H3K27Ac_rep2_2.fastq.gz

nservant commented 3 years ago

OK. So update the config file with:

PAIR1_EXT=_1
PAIR2_EXT=_2

These options allow HiC-Pro to detect the R1 and R2 files using a single regexp. Best
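To see why the mate tag matters, here is a minimal Python sketch of suffix-based pairing. It is not HiC-Pro's actual matching code; the function name `pair_fastq` and the list of accepted file suffixes are assumptions made for illustration. With `_1`/`_2` the files from the listing above pair up, while a tag that does not occur in the file names (for example `_R1`/`_R2`) matches nothing, which would be consistent with a run that sets up its output folder and then does no work.

```python
from pathlib import PurePosixPath

# Hypothetical helper: groups FASTQ names into (R1, R2) pairs by mate tag.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")  # assumed list


def pair_fastq(names, pair1_ext="_1", pair2_ext="_2"):
    """Return {sample_stem: (r1_name, r2_name)} for names carrying a mate tag."""
    mates = {}
    for name in names:
        base = PurePosixPath(name).name
        for suffix in FASTQ_SUFFIXES:
            if base.endswith(suffix):
                stem = base[: -len(suffix)]
                break
        else:
            continue  # not a FASTQ file, ignore it
        if stem.endswith(pair1_ext):
            mates.setdefault(stem[: -len(pair1_ext)], {})["R1"] = name
        elif stem.endswith(pair2_ext):
            mates.setdefault(stem[: -len(pair2_ext)], {})["R2"] = name
    # A mate that has no partner shows up as None in its slot.
    return {k: (v.get("R1"), v.get("R2")) for k, v in sorted(mates.items())}
```

Calling `pair_fastq(names, "_R1", "_R2")` on the same names returns an empty dictionary, so a quick check like this can confirm the tags before launching a run.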

yamzaleg commented 3 years ago

Thank you so much. Unfortunately, I am now getting another error:

Run HiC-Pro 2.11.4
Wed Jan 27 00:44:36 PST 2021
Bowtie2 alignment step1 ...
Logs: logs/GM_rep1/mapping_step1.log
[main_samview] fail to read the header from "-".
[main_samview] fail to read the header from "-".
Exit: Error in reads alignment - Exit
make: *** [bowtie_global] Error 1

nservant commented 3 years ago

Could you check whether there is an error message in logs/GM_rep1/mapping_step1.log, please?

nservant commented 3 years ago

You did not provide the bowtie2 index in your config! Please set BOWTIE2_IDX_PATH to the path to the bowtie2 indexes. As you put hg19 as the reference genome, the indexes must be named with the hg19 prefix. Otherwise, update the reference genome.

yamzaleg commented 3 years ago

thank you so much for your prompt responses! I noticed my error and adjusted the config file to include path to the indices of bowtie2. For some reason I keep getting an error that the directory with my indices is not found.

I'll send you my updated config file as well as a screenshot of how the directory with my hg19 bowtie2 indices looks.

config-hicpro_1.txt
Screen Shot 2021-01-27 at 1 57 59 AM

nservant commented 3 years ago

I know that's a bit unclear, but bowtie2 indexes are actually detected by concatenating BOWTIE2_IDX_PATH and REFERENCE_GENOME. So in your case, please use:

BOWTIE2_IDX_PATH = ~/new/amzaleg/chipseq/hg19/hg19_bt2/
REFERENCE_GENOME = hg19_bt2

yamzaleg commented 3 years ago

Thank you so much! It seems to be running now. I appreciate all your help.
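For anyone hitting the same "index not found" message, the concatenation rule described above can be checked before launching a run. The sketch below is a hedged illustration, not part of HiC-Pro: `missing_index_files` is a hypothetical helper that joins the two config values and looks for the six files a Bowtie 2 index normally consists of (`.1.bt2` through `.rev.2.bt2`, or the `.bt2l` variants for large indexes).

```python
import os

# File suffixes that make up a standard (small) Bowtie 2 index.
BT2_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")


def missing_index_files(idx_path, reference_genome, large=False):
    """List the index files that do not exist under idx_path/reference_genome.*

    idx_path and reference_genome stand for the BOWTIE2_IDX_PATH and
    REFERENCE_GENOME config values; large=True checks the .bt2l variants.
    """
    prefix = os.path.join(os.path.expanduser(idx_path), reference_genome)
    ext = "l" if large else ""
    expected = [prefix + suffix + ext for suffix in BT2_SUFFIXES]
    return [path for path in expected if not os.path.isfile(path)]
```

An empty list means the prefix resolves to a complete index. With the values from this thread, `missing_index_files("~/new/amzaleg/chipseq/hg19/hg19_bt2/", "hg19_bt2")` would be the call to try; using `hg19` as the genome name would report all six files as missing if the index files are named `hg19_bt2.*`.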

yamzaleg commented 3 years ago

Hello! I was able to run the code. My university uses Slurm, and based on the paper I thought I would request 10 GB of memory per processor core, split into 10 tasks, for 20 hours. Unfortunately, the job ran for the full 20 hours without completing the first fastq file. Should I ask for more memory? Am I going about this correctly? Below is my Slurm script:

#!/bin/bash
#SBATCH --ntasks=10
#SBATCH --mem-per-cpu=10GB
#SBATCH --time=20:00:00

/home1/amzaleg/new/amzaleg/hichip_tools/HiC-Pro-2.11.4/bin/HiC-Pro -i ~/new/amzaleg/hichip/raw_data -o ~/new/amzaleg/hichip/run/new_output -c ~/new/amzaleg/hichip/run/config-hicpro.txt -s mapping -s proc_hic

nservant commented 3 years ago

Hi, If you want to speed up the processing, you should ;

In this case, all chunks will be processed in parallel and then merged before building the contact maps. The number of cores that you set up in the config is not extremely useful, as it only affects the bowtie2 mapping. Most of the other analysis steps are not multi-threaded.

Best
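The reply above relies on the input being cut into chunks. As a rough illustration of what such a split involves, here is a minimal Python sketch; it is not HiC-Pro's own splitting utility, and the function name `split_fastq`, the uncompressed input, and the `NN_` file-name prefix are assumptions chosen to resemble the chunk names shown later in this thread. Both mates of a pair have to be split with the same chunk size so that chunk N of `_1` still lines up with chunk N of `_2`.

```python
import os


def split_fastq(path, reads_per_chunk, out_dir):
    """Split an uncompressed FASTQ file into chunks of reads_per_chunk reads.

    Hypothetical helper: chunks are written to out_dir as 00_<name>,
    01_<name>, ... and the list of written paths is returned.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(path)
    lines_per_chunk = 4 * reads_per_chunk  # one FASTQ record spans 4 lines
    out_paths = []
    chunk = None
    with open(path) as handle:
        for line_no, line in enumerate(handle):
            if line_no % lines_per_chunk == 0:
                # Start a new chunk file every lines_per_chunk lines.
                if chunk is not None:
                    chunk.close()
                out = os.path.join(out_dir, f"{len(out_paths):02d}_{base}")
                out_paths.append(out)
                chunk = open(out, "w")
            chunk.write(line)
    if chunk is not None:
        chunk.close()
    return out_paths
```

Streaming line by line keeps memory use flat regardless of chunk size, which matters when each chunk holds millions of reads.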

yamzaleg commented 3 years ago

Just a quick clarification: I was able to split every fastq file into 10-million-read chunks. Regarding "put all the chunks in the same output folder": I put the split files for both mates of each condition in a separate directory. I then ran the parallel command, pointing the input to the directory that contains all the split files.

When I ran the parallel mode, I got both .sh files to run via SLURM. I'm just concerned that I'm doing something wrong, because the split files look like this:

├── GM_rep1
│   ├── 00_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 00_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 01_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 01_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 02_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 02_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 03_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 03_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 04_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 04_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 05_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 05_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 06_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 06_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 07_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 07_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 08_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 08_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 09_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 09_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 10_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 10_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 11_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 11_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 12_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 12_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 13_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 13_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 14_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 14_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 15_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 15_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 16_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 16_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 17_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 17_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 18_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 18_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 19_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 19_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 20_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 20_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 21_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 21_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 22_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 22_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 23_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 23_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 24_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 24_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 25_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 25_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 26_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 26_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 27_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 27_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 28_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 28_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 29_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 29_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 30_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 30_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 31_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 31_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 32_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   ├── 32_GM_HiChIP_H3K27Ac_rep1_2.fastq
│   ├── 33_GM_HiChIP_H3K27Ac_rep1_1.fastq
│   └── 33_GM_HiChIP_H3K27Ac_rep1_2.fastq

Running the part 1 .sh file yielded this result:

├── 00_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 00_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 00_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 00_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 01_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 01_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 01_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 01_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 02_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 02_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 02_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 02_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 03_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 03_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 03_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 03_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 04_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 04_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 04_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 04_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 05_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 05_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 05_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 05_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 06_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 06_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 06_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 06_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 07_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 07_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 07_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 07_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 08_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 08_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 08_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 08_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 09_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 09_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 09_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 09_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 10_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 10_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 10_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 10_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 11_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 11_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 11_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 11_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 12_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 12_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 12_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 12_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 13_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 13_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 13_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 13_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 14_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 14_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 14_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 14_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 15_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 15_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 15_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 15_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 16_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 16_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 16_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 16_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 17_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 17_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 17_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 17_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 18_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 18_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 18_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 18_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 19_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 19_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 19_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 19_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 20_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 20_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 20_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 20_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 21_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 21_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 21_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 21_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 22_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 22_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 22_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 22_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 23_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 23_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 23_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 23_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 24_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 24_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 24_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 24_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 25_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 25_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 25_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 25_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 26_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 26_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 26_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 26_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 27_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 27_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 27_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 27_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 28_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 28_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 28_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 28_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 29_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 29_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 29_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 29_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 30_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 30_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 30_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 30_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 31_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 31_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 31_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 31_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 32_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 32_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 32_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
├── 32_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq
├── 33_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.bam
├── 33_GM_HiChIP_H3K27Ac_rep1_1_hg19_bt2.bwt2glob.unmap.fastq
├── 33_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.bam
└── 33_GM_HiChIP_H3K27Ac_rep1_2_hg19_bt2.bwt2glob.unmap.fastq

with some of those files being empty. Is this what you had in mind?

nservant commented 3 years ago

Hi, the input data looks OK, but indeed, you shouldn't have any empty files in the end. Is there any error in the logs folder? Do you have a bwt2 folder?

yamzaleg commented 3 years ago

Hi, I do have a logs folder, but I don't see any errors (there are many files in there for each replicate directory). Should I look for one in particular? I also see the bwt2 folder and it looks like there are no empty files there, but for some reason one of my replicates for one of the samples doesn't have all the split files.

In MyLa_rep2 there should be split files from 0-21, but only 10-21 are in the bwt2 folder.
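Spotting missing or empty outputs by eye is error-prone with this many chunks, so a small script can do the comparison. The following Python sketch is only an illustration: `missing_alignments` is a hypothetical helper, and it assumes the output naming seen in the listing above (`<chunk name>_<REFERENCE_GENOME>.bwt2glob.bam`) with uncompressed `.fastq` chunks.

```python
import os


def missing_alignments(chunk_dir, bam_dir, genome="hg19_bt2"):
    """Return the chunk files in chunk_dir with no non-empty BAM in bam_dir.

    Assumes outputs are named <chunk stem>_<genome>.bwt2glob.bam, as in the
    listing above; the genome default is the value used in this thread.
    """
    missing = []
    for name in sorted(os.listdir(chunk_dir)):
        if not name.endswith(".fastq"):
            continue  # skip anything that is not a FASTQ chunk
        stem = name[: -len(".fastq")]
        bam = os.path.join(bam_dir, f"{stem}_{genome}.bwt2glob.bam")
        # A chunk counts as missing if its BAM is absent or zero bytes long.
        if not os.path.isfile(bam) or os.path.getsize(bam) == 0:
            missing.append(name)
    return missing
```

The returned names point directly at the per-chunk log files worth opening first.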

yamzaleg commented 3 years ago

So I checked the logs for that replicate, looking at one of the chunks that was missing, and this was the error in the file:

HiC-Pro mapping

Error reading block of _offs[] array: 8188, 716196308
Error Reading File!
Error: Encountered internal Bowtie 2 exception (#1)
Command: /home1/amzaleg/new/amzaleg/bin/bowtie2-align-s --wrapper basic-0 --very-sensitive -L 30 --score-min L,-0.6,-0.2 --end-to-end --reorder --rg-id BMG --rg SM:00_MyLa_HiChIP_H3K27Ac_rep2_2 -p 5 -x /home1/amzaleg/new/amzaleg/chipseq/hg19/hg19_bt2//hg19_bt2 --passthrough -U rawdata/MyLa_rep2/00_MyLa_HiChIP_H3K27Ac_rep2_2.fastq
(ERR): bowtie2-align exited with value 1

nservant commented 3 years ago

I've never seen that before! It seems to be an internal bowtie2 error.

yamzaleg commented 3 years ago

That's weird as it didn't happen to any of my other samples (or even all the chunks for that replicate)! I'll do some investigating. Thank you for your help!

nservant commented 3 years ago

http://seqanswers.com/forums/showthread.php?t=5318

"These types of errors occur when the files are genuinely either corrupt or incomplete (e.g. if the disk becomes exhausted during the index-building process). Can you send detailed output from one example where this happens, including a 'ls -l' on the index files after bowtie-build completes?"

nservant commented 3 years ago

Maybe you can check if the files which crashed are complete (reads/qualities) or try to realign one of them manually, to see if you can reproduce the error
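One way to run the completeness check suggested above is a quick structural pass over the chunk. This is a hedged Python sketch rather than an established tool: `check_fastq` is a hypothetical helper that assumes uncompressed, four-line FASTQ records and reports the first record whose header, separator, or sequence/quality lengths look wrong.

```python
def check_fastq(path):
    """Return (n_complete_reads, problem); problem is None if the file looks intact.

    Assumes plain four-line FASTQ records (no line-wrapped sequences).
    """
    n_reads = 0
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                return n_reads, None  # clean end of file
            if not header.strip():
                continue  # tolerate stray blank lines
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline()
            qual = handle.readline().rstrip("\r\n")
            where = f"record {n_reads + 1}"
            if not header.startswith("@"):
                return n_reads, f"{where}: header does not start with '@'"
            if not plus.startswith("+"):
                return n_reads, f"{where}: missing '+' separator (truncated?)"
            if len(seq) != len(qual):
                return n_reads, f"{where}: sequence/quality length mismatch"
            n_reads += 1
```

Running it on both mates of a crashed chunk and comparing the read counts also shows whether the two files went out of sync during splitting.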

ProfH2SO4 commented 1 month ago

I know that's a bit unclear, but bowtie2 indexes are actually detected by concatenating BOWTIE2_IDX_PATH and REFERENCE_GENOME. So in your case, please use:

BOWTIE2_IDX_PATH = ~/new/amzaleg/chipseq/hg19/hg19_bt2/
REFERENCE_GENOME = hg19_bt2

It works! Thank you very much :)