Rayko87 closed this issue 4 years ago
Hi Robbert,
Yes, there are two options. Either use the -s option, starting from the allValidPairs file, or simply use the build_matrix tool in scripts/.
Best
Thanks for your quick answer.
I am trying to do so, but I got an error:
/mnt/data/robert/ANALISIS/Virtual_4C_DND41$ HiC-Pro -i /mnt/data/robert/ANALISIS/HiC-Pro-DND41/Second_analysis_DND41_HiCpro/hic_results/data/sample1/sample1.allValidPairs -o DND41_5KB_chr8 -c config-hicpro.txt -s build_contact_maps
Exit: Error: Directory Hierarchy of rawdata '/mnt/data/robert/ANALISIS/HiC-Pro-DND41/Second_analysis_DND41_HiCpro/hic_results/data/sample1/sample1.allValidPairs' is not correct. No '.allValidPairs' files detected
This is my configuration file. With the -s option, should the input data be specified differently?
#########################################################################
#########################################################################
TMP_DIR = tmp
LOGS_DIR = log
BOWTIE2_OUTPUT_DIR =
MAPC_OUTPUT =
RAW_DIR =
#######################################################################
#######################################################################
N_CPU = 11
LOGFILE = hicpro.log
JOB_NAME =
JOB_MEM =
JOB_WALLTIME =
JOB_QUEUE =
JOB_MAIL =
#########################################################################
#########################################################################
PAIR1_EXT = _R1_001
PAIR2_EXT = _R2_001
#######################################################################
REFERENCE_GENOME = GRChg37-hg19
GENOME_SIZE = chrom_hg19.sizes
#######################################################################
#######################################################################
ALLELE_SPECIFIC_SNP =
#######################################################################
#######################################################################
CAPTURE_TARGET =
REPORT_CAPTURE_REPORTER = 1
#######################################################################
#######################################################################
GENOME_FRAGMENT = /mnt/data/robert/ANALISIS/HiC-Pro-DND41/GATC_hg19
LIGATION_SITE = GATCGATC
MIN_FRAG_SIZE =
MAX_FRAG_SIZE =
MIN_INSERT_SIZE =
MAX_INSERT_SIZE =
#######################################################################
#######################################################################
MIN_CIS_DIST =
GET_ALL_INTERACTION_CLASSES = 1
GET_PROCESS_SAM = 0
RM_SINGLETON = 1
RM_MULTI = 1
RM_DUP = 1
#######################################################################
#######################################################################
BIN_SIZE = 5000
MATRIX_FORMAT = upper
#######################################################################
#######################################################################
MAX_ITER = 100
FILTER_LOW_COUNT_PERC = 0.02
FILTER_HIGH_COUNT_PERC = 0
EPS = 0.1
#######################################################################
#######################################################################
MIN_MAPQ = 10
BOWTIE2_IDX_PATH =/mnt/data/robert/index_and_genome_files
BOWTIE2_GLOBAL_OPTIONS = --very-sensitive -L 30 --score-min L,-0.6,-0.2 --end-to-end --reorder
BOWTIE2_LOCAL_OPTIONS = --very-sensitive -L 20 --score-min L,-0.6,-0.2 --end-to-end --reorder
#######################################################################
#######################################################################
Thanks!
Even in stepwise mode, HiC-Pro expects a folder as input, with one subfolder per sample.
So in your case, use -i /mnt/data/robert/ANALISIS/HiC-Pro-DND41/Second_analysis_DND41_HiCpro/hic_results/data/
Best
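The layout described above (an input folder with one subfolder per sample) can be checked before launching a stepwise run. This is only a minimal sketch: the function name check_hicpro_input is hypothetical and not part of HiC-Pro, and it only looks for .allValidPairs files, which is what the build_contact_maps step in this thread starts from.

```shell
# Hypothetical helper (not part of HiC-Pro): verify that the input folder
# holds one subfolder per sample, each with at least one .allValidPairs file.
check_hicpro_input() {
    local input_dir="$1"
    local sample found
    local status=0
    local nsamples=0

    # Passing a file instead of a folder is rejected straight away.
    if [ ! -d "$input_dir" ]; then
        echo "not a directory: $input_dir" >&2
        return 1
    fi

    for sample in "$input_dir"/*/; do
        [ -d "$sample" ] || continue   # the glob matched nothing
        nsamples=$((nsamples + 1))
        found=$(find "$sample" -maxdepth 1 -name '*.allValidPairs' | head -n 1)
        if [ -n "$found" ]; then
            echo "ok: $found"
        else
            echo "no .allValidPairs file in $sample" >&2
            status=1
        fi
    done

    if [ "$nsamples" -eq 0 ]; then
        echo "no sample subfolders in $input_dir" >&2
        return 1
    fi
    return "$status"
}
```

Calling check_hicpro_input with the path of the sample1.allValidPairs file itself fails, while calling it with the data/ folder that contains sample1/ succeeds, which mirrors the difference between the two -i values discussed above.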
Hello Servant,
Thanks again for your answers. Changing this generates a different error. I have now run it as you suggested:
HiC-Pro -i /mnt/data/robert/ANALISIS/HiC-Pro-DND41/Second_analysis_DND41_HiCpro/hic_results/data/ -o DND41_5KB_chr8 -c config-hicpro.txt -s build_contact_maps
Tue Mar 24 20:07:52 EDT 2020
Generate binned matrix files ...
Exit: Error in input type.'.fastq|.bam|.validPairs|.allValidPairs|.matrix' files are expected.
/opt/HiC-Pro/bin/../scripts//Makefile:171: recipe for target 'build_raw_maps' failed
make: *** [build_raw_maps] Error 1
However, this directory, /mnt/data/robert/ANALISIS/HiC-Pro-DND41/Second_analysis_DND41_HiCpro/hic_results/data/, is where I have the folder called "sample1", which contains all these files:
-rw-rw-r-- 1 robert robert 270M Mar 24 19:53 HiChip27-DND41_GRChg37-hg19.bwt2pairs.DEPairs
-rw-rw-r-- 1 robert robert 405K Mar 24 19:53 HiChip27-DND41_GRChg37-hg19.bwt2pairs.DumpPairs
-rw-rw-r-- 1 robert robert 0 Mar 24 19:53 HiChip27-DND41_GRChg37-hg19.bwt2pairs.FiltPairs
-rw-rw-r-- 1 robert robert 129M Mar 24 19:53 HiChip27-DND41_GRChg37-hg19.bwt2pairs.REPairs
-rw-rw-r-- 1 robert robert 328 Mar 24 19:53 HiChip27-DND41_GRChg37-hg19.bwt2pairs.RSstat
-rw-rw-r-- 1 robert robert 239M Mar 24 19:53 HiChip27-DND41_GRChg37-hg19.bwt2pairs.SCPairs
-rw-rw-r-- 1 robert robert 0 Mar 24 19:53 HiChip27-DND41_GRChg37-hg19.bwt2pairs.SinglePairs
-rw-rw-r-- 1 robert robert 23G Mar 24 19:53 HiChip27-DND41_GRChg37-hg19.bwt2pairs.validPairs
-rw-rw-r-- 1 robert robert 20G Mar 24 19:55 sample1.allValidPairs
Thanks again!
Robert
The first lines of the config file should not be edited! They should read:
# Please change the variable settings below if necessary
#########################################################################
## Paths and Settings - Do not edit !
#########################################################################
TMP_DIR = tmp
LOGS_DIR = logs
BOWTIE2_OUTPUT_DIR = bowtie_results
MAPC_OUTPUT = hic_results
RAW_DIR = rawdata
Hello,
Thanks, I changed it but it still does not work. I am now running this:
HiC-Pro -i /mnt/data/robert/ANALISIS/HiC-Pro-DND41/Second_analysis_DND41_HiCpro/hic_results/data/ -o DND41_5Kb_chr8 -c config-hicpro.txt -s build_contact_maps
Run HiC-Pro 2.11.1
mkdir: missing operand
Try 'mkdir --help' for more information.
/opt/HiC-Pro/bin/../scripts//Makefile:75: recipe for target 'configure' failed
make: *** [configure] Error 1
The program stops after creating a folder named DND41_5Kb_chr8, which contains a copy of the config file and the data folder, with the sample1 subfolder containing all the HiC-Pro processed files (.allValidPairs and so on). It doesn't matter where I run this: it stops after generating these same folders again.
This is again linked to RAW_DIR.
In theory, it should run mkdir $RAW_DIR, and if the variable is not set, it crashes.
Are you sure that your config is correct?
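An empty path variable of this kind can be caught before running HiC-Pro. The following is a minimal sketch, assuming the one "NAME = value" setting per line layout of the config file; the function name check_config_paths is hypothetical, and the list of variables is limited to the five from the "Do not edit" block.

```shell
# Hypothetical helper (not part of HiC-Pro): report path settings that are
# missing or empty in a HiC-Pro config file.
check_config_paths() {
    local config="$1"
    local name value
    local status=0

    for name in TMP_DIR LOGS_DIR BOWTIE2_OUTPUT_DIR MAPC_OUTPUT RAW_DIR; do
        # Keep the text after '=' on the first matching line, without whitespace.
        value=$(sed -n "s/^[[:space:]]*${name}[[:space:]]*=\(.*\)$/\1/p" "$config" \
            | head -n 1 | tr -d '[:space:]')
        if [ -z "$value" ]; then
            echo "$name is empty or missing in $config" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, check_config_paths config-hicpro.txt returns a non-zero status and names each offending variable when a line such as "RAW_DIR =" has no value.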
Hello again,
Sorry to bother you. You were right: my config file didn't have the correct RAW_DIR line. However, running it now generates a new error:
Wed Mar 25 14:47:02 EDT 2020
Generate binned matrix files ...
Logs: logs/sample1/build_raw_maps.log
sed: -e expression #1, char 8: unknown option to `s'
When I go to the log, it is empty with only one line saying:
I think the error comes from the build_raw_maps.sh script in the scripts folder. The point is that I do not understand what is going wrong. The error comes from line 103. Could you try editing the script to add a trace at line 102:
echo ${r}
N
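For context, one common way to trigger the sed "unknown option to s" message is a shell variable containing "/" being expanded inside an s/.../.../ expression. Whether that is what happens here is not confirmed anywhere in this thread. The sketch below is a generic illustration with hypothetical function names and values, not code from build_raw_maps.sh.

```shell
# Generic illustration; these functions are hypothetical and are not
# taken from build_raw_maps.sh.

# Fragile: if $2 contains "/", sed parses the text after it as flags
# of the s command and aborts.
substitute_with_slash() {
    printf '%s\n' "$1" | sed -e "s/NAME/$2/"
}

# More robust for path-like values: use "|" as the delimiter
# (assumes the value itself contains no "|").
substitute_with_pipe() {
    printf '%s\n' "$1" | sed -e "s|NAME|$2|"
}

substitute_with_slash "NAME" "data/sample1" || echo "sed rejected the expression"
substitute_with_pipe "NAME" "data/sample1"
```

The first call fails because sed receives s/NAME/data/sample1/, while the second prints data/sample1. Printing the variables that feed the sed call, as the echo trace above does, is a direct way to see whether a value contains such a character.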
Hello Servant,
Thu Mar 26 14:16:03 EDT 2020
Generate binned matrix files ...
Logs: logs/sample1/build_raw_maps.log
sed: -e expression #1, char 8: unknown option to `s'
sed: -e expression #1, char 8: unknown option to `s'
In fact, when I change the build_raw_maps.sh script in the scripts folder:
ldir=${LOGS_DIR}/${RES_FILE_NAME}
mkdir -p ${ldir}
echo "Logs: ${ldir}/build_raw_maps.test.log"
echo "${BIN_SIZE} hELLO"
if [ -d ${DATA_DIR}/${RES_FILE_NAME} ]; then
MATRIX_DIR=${MAPC_OUTPUT}/matrix/${RES_FILE_NAME}/raw
for bsize in ${BIN_SIZE}
do
After changing the name of the log file (adding "test" to it), the result is still the same:
Thu Mar 26 14:19:18 EDT 2020
Generate binned matrix files ...
Logs: logs/sample1/build_raw_maps.log
sed: -e expression #1, char 8: unknown option to `s'
sed: -e expression #1, char 8: unknown option to `s'
The output contains neither the word "test" nor anything else I added.
Moreover, there is a file called build_matrix in the scripts folder that is not a .sh file. Is this file OK?
It's a bit difficult to help you like this.
As a last option, you can directly use the build_matrix tool (without .sh).
This is the tool that generates the maps.
./build_matrix
./build_matrix: missing --binsize or --binfile option
usage: ./build_matrix --binsize BINSIZE|--binfile --chrsizes FILE --ifile FILE
--oprefix PREFIX [--binadjust] [--step STEP] [--binoffset OFFSET]
[--matrix-format asis|upper|lower|complete][--chrA CHR... --chrB CHR...] [--quiet] [--progress] [--detail-progress]
The input file (--ifile) is your allValidPairs file. Specify the bin size (--binsize), the chromosome sizes file (--chrsizes), an output prefix (--oprefix), and --matrix-format upper, and there you go!
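Since the valid pairs only need to be computed once, the same call can be repeated for several resolutions. The sketch below is a dry run that only prints the commands. It uses the options listed in the usage message above; the BUILD_MATRIX path, the file names, the output prefix scheme and the bin sizes are placeholders, and the function name print_build_matrix_cmds is hypothetical.

```shell
# Dry run: print one build_matrix command per bin size.
# BUILD_MATRIX, the file names and the bin sizes are placeholders.
BUILD_MATRIX=${BUILD_MATRIX:-./build_matrix}

print_build_matrix_cmds() {
    local valid_pairs="$1"
    local chrom_sizes="$2"
    local prefix="$3"
    local bsize
    shift 3

    # Every remaining argument is treated as a bin size in base pairs.
    for bsize in "$@"; do
        echo "$BUILD_MATRIX --binsize $bsize --chrsizes $chrom_sizes" \
             "--ifile $valid_pairs --oprefix ${prefix}_${bsize} --matrix-format upper"
    done
}

print_build_matrix_cmds sample1.allValidPairs chrom_hg19.sizes sample1 5000 20000
```

Once the printed commands look right, they can be run one by one. Keeping the bin size in the output prefix avoids one resolution overwriting another.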
Thanks Servant!
I don't know what is happening with the first script, but your solution worked perfectly. With the build_matrix and the ice scripts I got the results I needed!
Hi @nservant, I am trying to use the /apps/hicpro/2.10.0/scripts/build_matrix command to get a different resolution. Unfortunately, I'm getting this error:
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_S_construct null not valid
Aborted (core dumped)
Command:
/apps/hicpro/2.10.0/scripts/build_matrix --binsize 5000 --chrsizes chr.sizes --ifile sample1_allValidPairs.gz --matrix-format upper --oprefix sample1
Could you please help me sort out this issue? Thanks
Hello Servant,
More than an issue, this is a question about HiC-Pro. Since getting the valid pairs is the most time-consuming step, is there a way to get contact matrices at a different resolution (in kb) from the validPairs file?
Thanks, Robert