mervesa / HiCDCPlus

HiCDCPlus
15 stars 2 forks source link

Error with hicdcdiff function #8

Closed koushik20 closed 2 years ago

koushik20 commented 2 years ago

Firstly, thank you for the detailed documentation of HiCDCPlus package. I am trying to find differential interactions from my allvalidpairsdata. I was able to create an index file with the union of significant interactions but when proceeding to the next step to perform differential analysis with hicdcdiff function I am getting the following error.

Error in gi_list_read(path.expand(prefix)) : 
  Some of columns 'chrI','startI','chrJ','startJ','D' do not
            exist in file

Below is the code I am trying to execute

library(HiCDCPlus)
library(BSgenome.Hsapiens.UCSC.hg19)
library(DESeq2)

#Generate features
outdir <- "/mnt/Hi-ChIP_Demo/"
construct_features(output_path=paste0(outdir,"hg19_5kb_GATC"),
                   gen="Hsapiens",gen_ver="hg19",
                   sig="GATC",bin_type="Bins-uniform",
                   binsize=5000,
                   wg_file=NULL,
                   chrs=c("chr15"))

#Adding allValidPairs
hicfile_paths <- c("HiChIP_KURA_chr15_allValidPairs.txt",
                   "HiChIP_FT246_chr15_allValidPairs.txt")
indexfile <- data.frame()
for(hicfile_path in hicfile_paths){
  output_path <- outdir
  #generate gi_list instance
  gi_list<- generate_bintolen_gi_list(
    bintolen_path=paste0(outdir,"hg19_5kb_GATC_bintolen.txt.gz"),
    gen="Hsapiens",gen_ver="hg19")
  #add .hic counts
  #gi_list<-add_hic_counts(gi_list,hic_path = hicfile_path)
  gi_list <- add_hicpro_allvalidpairs_counts(gi_list, hicfile_paths[2])
  #expand features for modeling
  gi_list<-expand_1D_features(gi_list)
  #run HiC-DC+ on 2 cores
  set.seed(1010) #HiC-DC downsamples rows for modeling
  gi_list<-HiCDCPlus(gi_list,ssize=0.1)
  for (i in seq(length(gi_list))){
    indexfile<-unique(rbind(indexfile,
                            as.data.frame(gi_list[[i]][gi_list[[i]]$qvalue<=0.05])[c('seqnames1',
                                                                                     'start1','start2')]))
  }
  #write results to a text file
  gi_list_write(gi_list,fname=paste0(output_path, 'kura_chr15_int.txt'))
}
#save index file---union of significants b/w FT246 & Kuramochi in Chr15
colnames(indexfile)<-c('chr','startI','startJ')
data.table::fwrite(indexfile,
                   paste0(outdir,'/FT246&Kuramochi_chr15_interactions_analysis_indices.txt.gz'),
                   sep='\t',row.names=FALSE,quote=FALSE)
#Differential analysis using modified DESeq2 (hicdcdiff)
hicdcdiff(input_paths=list(FT246=paste0(outdir,'/HiChIP-FT246_merged_editF_allValidPairs.txt'),
                           Kuramochi=paste0(outdir,'/HiChIP-Kuramochi-merged_editF_allValidPairs.txt')),
          filter_file=paste0(outdir,'/FT246&Kuramochi_chr15_interactions_analysis_indices.txt'),
          output_path=paste0(outdir,'/diff_analysis_FT246&Kuramochi/'),
          bin_type = "Bins-uniform",
          fitType = 'local',
          binsize=5000,
          chrs = 'chr15',
          diagnostics = TRUE)

Please guide me. Thank you!

mervesa commented 2 years ago

Hi @koushik20 , thanks for using HiCDCPlus! hicdcdiff input_paths accepts either paths to written gi_list objects that you can save via gi_list_write (e.g., kura_chr15_int.txt) , or .hic files. It doesn't accept allValidPairs.txt files. Hope this helps.

mervesa commented 2 years ago

Closing due to inactivity.