nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

error when running the hic_qc scripts #504

Closed yeroslaviz closed 2 years ago

yeroslaviz commented 2 years ago

Hi, before running HiC-Pro on my own data, I tried to complete a run with your test data. After installing HiC-Pro and downloading the test data files, I started the run, which went well up to the QC plotting step. Then I got this error:

Run HiC-Pro 3.1.0
--------------------------------------------   
Wed Feb  2 16:07:31 CET 2022
Bowtie2 alignment step1 ...
Logs: logs/dixon_2M_2/mapping_step1.log
Logs: logs/dixon_2M/mapping_step1.log

--------------------------------------------   
Wed Feb  2 16:12:02 CET 2022
Bowtie2 alignment step2 ...
Logs: logs/dixon_2M_2/mapping_step2.log
Logs: logs/dixon_2M/mapping_step2.log

--------------------------------------------   
Wed Feb  2 16:15:37 CET 2022
Combine R1/R2 alignment files ...
Logs: logs/dixon_2M_2/mapping_combine.log
Logs: logs/dixon_2M/mapping_combine.log

--------------------------------------------   
Wed Feb  2 16:15:40 CET 2022
Mapping statistics for R1 and R2 tags ...
Logs: logs/dixon_2M_2/mapping_stats.log
Logs: logs/dixon_2M/mapping_stats.log

--------------------------------------------   
Wed Feb  2 16:15:42 CET 2022
Pairing of R1 and R2 tags ...
Logs: logs/dixon_2M_2/mergeSAM.log
Logs: logs/dixon_2M/mergeSAM.log

--------------------------------------------   
Wed Feb  2 16:15:51 CET 2022
Assign alignments to restriction fragments ... 
Logs: logs/dixon_2M_2/mapped_2hic_fragments.log
Logs: logs/dixon_2M/mapped_2hic_fragments.log  

--------------------------------------------   
Wed Feb  2 16:16:33 CET 2022
Merge chunks from the same sample ...
Logs: logs/dixon_2M/merge_valid_interactions.log
Logs: logs/dixon_2M_2/merge_valid_interactions.log

--------------------------------------------   
Wed Feb  2 16:16:34 CET 2022
Merge stat files per sample ...
Logs: logs/dixon_2M/merge_stats.log
Logs: logs/dixon_2M_2/merge_stats.log

--------------------------------------------   
Wed Feb  2 16:16:35 CET 2022
Run quality checks for all samples ...
Logs: logs/dixon_2M/make_Rplots.log
Logs: logs/dixon_2M_2/make_Rplots.log
make: *** [/fs/home/yeroslaviz/projects/HiC_Pro/HiC-Pro_3.1.0/bin/../scripts//Makefile:181: hic_qc] Error 1

I found the error in the R output file plot_hic_contacts.Rout. It says:

...
> ## Histogram of insert size
> allvalidpairs <- list.files(path=hicDir, pattern=paste0("^[[:print:]]*\\.validPairs$"), full.names=TRUE)
> stats_per_validpairs<- lapply(allvalidpairs, read.csv, sep="\t", as.is=TRUE, header=FALSE, row.names=1, nrow=100000)
Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
  no lines available in input
Calls: lapply -> FUN -> read.table
Execution halted

Any ideas why this is happening? Am I correct in assuming that the error occurs only with the second dataset and not the first one?

thanks

Assa
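R's `read.table` raises "no lines available in input" when it is given a file with no lines, so one way to find out which sample is affected is to look for empty `.validPairs` files. The following is a minimal sketch, not part of HiC-Pro: the function name `find_empty_validpairs` is hypothetical, and it assumes the output directory can simply be searched recursively.

```python
from pathlib import Path


def find_empty_validpairs(results_dir):
    """Return the sorted paths of zero-byte *.validPairs files under results_dir.

    Assumption: results_dir is the HiC-Pro output directory; its exact
    layout does not matter because the search is recursive.
    """
    root = Path(results_dir)
    return sorted(
        path
        for path in root.rglob("*.validPairs")
        if path.is_file() and path.stat().st_size == 0
    )
```

Calling `find_empty_validpairs("hicpro_output")` (a hypothetical directory name) returns the files that would stop the plotting script. An empty list suggests the cause lies elsewhere.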

yeroslaviz commented 2 years ago

OK, I solved it. I didn't realize that I was using the hg38 genome but the HindIII_resfrag file from hg19. This caused the error. After creating a new bed file for hg38 (HindIII_resfrag_hg38), it ran smoothly.
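A mismatch like this can be spotted before a run by comparing the restriction fragment BED file with the chromosome sizes file of the genome. The sketch below is not a HiC-Pro utility and all function names are hypothetical. It assumes a tab-separated BED file with chromosome, start and end in the first three columns, a two-column chromosome sizes file, and fragments that cover each chromosome up to its last base. Any chromosome whose largest fragment end differs from the listed length, or which is missing from the sizes file, is reported.

```python
def read_chrom_sizes(sizes_path):
    """Parse a two-column 'chromosome<whitespace>length' file into a dict."""
    sizes = {}
    with open(sizes_path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                sizes[fields[0]] = int(fields[1])
    return sizes


def max_fragment_ends(bed_path):
    """Return the largest fragment end coordinate seen for each chromosome."""
    ends = {}
    with open(bed_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Skip header-like or malformed lines.
            if len(fields) < 3 or fields[0].startswith(("#", "track", "browser")):
                continue
            chrom, end = fields[0], int(fields[2])
            ends[chrom] = max(ends.get(chrom, 0), end)
    return ends


def find_assembly_mismatches(bed_path, sizes_path):
    """Map chromosome -> (max fragment end, chromosome length or None).

    Only chromosomes where the two values disagree are returned.
    Assumption: the fragments span each chromosome end to end, so the
    largest end coordinate should equal the chromosome length.
    """
    sizes = read_chrom_sizes(sizes_path)
    problems = {}
    for chrom, end in max_fragment_ends(bed_path).items():
        length = sizes.get(chrom)
        if length is None or end != length:
            problems[chrom] = (end, length)
    return problems
```

An empty result means the two files are at least consistent in chromosome names and lengths. A non-empty result for most chromosomes points to files built from different assemblies.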

raquelsofi commented 2 years ago

I have the same error but my genome and bed file are both hg38. Any other ideas what could be the cause?

nancylinzhen commented 1 year ago

> I have the same error but my genome and bed file are both hg38. Any other ideas what could be the cause?

Have you solved this error? One of my datasets runs smoothly, but the other has the same problem as yours.