XiaoTaoWang / HiC_pipeline

An easy-to-use Hi-C data processing software supporting distributed computation.
http://xiaotaowang.github.io/HiC_pipeline/index.html
GNU General Public License v3.0

Mapping returning zero results #9

Closed by alobo4 1 year ago

alobo4 commented 1 year ago

Hi,

I am attempting to run the HCC1954 Hi-C data through the HiC_pipeline and then through the NeoLoopFinder pipeline, as you did in your paper. I prefetched the data from GSM3258551 (SRR7475914) and followed the steps on your quick-start page to get the hg19 reference genome, as mentioned on the GSM page. When I run the mapping portion of the pipeline, the resulting chunked .pairsam.gz files contain no paired reads.

My datasets.tsv file consists of one line:

SRR7475914_pass HCC1954 R1 MboI

This is the command I am attempting to run:

runHiC mapping -p ../data/ -g hg19 -f HiC-FASTQ -F FASTQ -A bwa-mem -t 10 --include-readid --chunkSize 76500000 --drop-seq --logFile runHiC-mapping.log

The resulting SRR7475914_pass_chunk0.pairsam.gz file ends with the following header line and has no paired reads after it:

#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2 pair_type

This prevents me from moving on to the filtering and binning steps.

Is there something I am missing within the setup to have the pipeline work correctly for this cell-line? Thank you so much!
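For anyone hitting the same symptom, a quick way to confirm that a chunk really holds no pairs is to count the non-header lines in the file. The following is a minimal sketch, not part of runHiC: the helper name `count_pair_records` is hypothetical, and it assumes only that header lines in the .pairsam.gz output start with `#`, as the `#columns:` line above does.

```python
import gzip


def count_pair_records(path):
    """Count non-header, non-blank lines in a gzip-compressed pairs/pairsam file.

    Hypothetical helper for illustration; assumes header lines start with '#'.
    """
    n_records = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and header lines such as "#columns: ...".
            if line.strip() and not line.startswith("#"):
                n_records += 1
    return n_records


if __name__ == "__main__":
    import sys

    # Usage: python count_pairs.py SRR7475914_pass_chunk0.pairsam.gz
    for pairsam_path in sys.argv[1:]:
        print(pairsam_path, count_pair_records(pairsam_path))
```

A count of zero means the mapping step wrote only the header, which points to a problem with the input reads rather than with the later filtering or binning steps.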

XiaoTaoWang commented 1 year ago

Hi, that is weird ... I processed exactly the same dataset using runHiC and no errors occurred. Since the first column of your "datasets.tsv" is "SRR7475914_pass", I was wondering what preprocessing steps you performed before running runHiC mapping?

XiaoTaoWang commented 1 year ago

I have included all necessary steps in the documentation (http://xiaotaowang.github.io/HiC_pipeline/quickstart.html), and no additional preprocessing steps or setup are needed.

alobo4 commented 1 year ago

Hi,

Sorry for the late response. I finally got around to looking at the commands I ran, and you were right! I accidentally added the "pass-filter" parameter when downloading the SRA file using SRA-Toolkit. Thank you so much for the suggestion!
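The clue in this thread was the "_pass" suffix in the first column of datasets.tsv. As a rough sketch of that check, the snippet below flags first-column entries that are not plain SRA run accessions. The function name `unusual_prefixes` is hypothetical, the whitespace-separated layout is assumed from the one-line example above, and the check only makes sense when the reads were downloaded from SRA; locally named FASTQ prefixes would be flagged too, and may be perfectly fine.

```python
import re

# SRA run accessions look like SRR/ERR/DRR followed by digits.
SRA_RUN = re.compile(r"^[SED]RR\d+$")


def unusual_prefixes(tsv_text):
    """Return first-column entries that are not plain SRA run accessions.

    Hypothetical helper; assumes whitespace-separated columns.
    """
    flagged = []
    for line in tsv_text.splitlines():
        fields = line.split()
        # Skip blank lines and comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        if not SRA_RUN.match(fields[0]):
            flagged.append(fields[0])
    return flagged


# The datasets.tsv line from this issue:
print(unusual_prefixes("SRR7475914_pass HCC1954 R1 MboI\n"))
# prints ['SRR7475914_pass']
```

A flagged entry is only a hint that an extra option changed the output during download; the fix reported here was to download the data again without the filter option.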