deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

Error in hicBuildmatrix #349

Closed: abhisheksinghnl closed this issue 5 years ago

abhisheksinghnl commented 5 years ago

Hi,

I aligned my reads with bwa and reordered them with Picard tools.

I also removed PCR duplicates and reads with MAPQ < 5 (could these be causing the error?).

When I run hicBuildMatrix as follows:

hicBuildMatrix --samFiles Re.PD.Mapq.MD.Sort.Reorder_log1.bam Re.PD.Mapq.MD.Sort.Reorder_log2.bam --outBam HiC.T.R1.bam --restrictionCutFile rest_site_positions.bed -o hic_matrix.h5 --minMappingQuality 5
reading Re.PD.Mapq.MD.Sort.Reorder_log1.bam and Re.PD.Mapq.MD.Sort.Reorder_log2.bam to build hic_matrix
Minimum distance considered between restriction sites is 300
Max distance: 800
Matrix size: 11659518

Traceback (most recent call last):
  File "/home/zzeeo/.conda/envs/hicexplorer/bin/hicBuildMatrix", line 4, in <module>
    __import__('pkg_resources').run_script('HiCExplorer==1.3', 'hicBuildMatrix')
  File "/home/zzeeo/.conda/envs/hicexplorer/lib/python2.7/site-packages/pkg_resources/__init__.py", line 664, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/zzeeo/.conda/envs/hicexplorer/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1444, in run_script
    exec(code, namespace, namespace)
  File "/home/zzeeo/.conda/envs/hicexplorer/lib/python2.7/site-packages/HiCExplorer-1.3-py2.7.egg-info/scripts/hicBuildMatrix", line 7, in <module>
    main()
  File "/home/zzeeo/.conda/envs/hicexplorer/lib/python2.7/site-packages/hicexplorer/hicBuildMatrix.py", line 644, in main
    "the --reorder option".format(mate1.qname, mate2.qname)
AssertionError: FATAL ERROR A00387:60:GW181017000:1:1103:19877:20290 A00387:60:GW181017000:1:1216:26838:31375
Be sure that the sam files have the same read order
If using Bowtie2 or Hisat2 add the --reorder option

I get the error above and I don't understand what the problem is.

Could you please shed some light on this?

Thank you.

joachimwolff commented 5 years ago

Hi,

As described in our documentation, you have to use the --reorder option for Hisat2 and Bowtie2, but BWA does not require it. I don't know what Picard tools does when reordering, and I cannot give you support for that tool.

Best,

Joachim

abhisheksinghnl commented 5 years ago

Hi,

I just relaunched the job without any filters (no removal of PCR duplicates, no MAPQ filtering) and it runs fine. I guess the culprit is Picard or some of its settings.

BTW, if I have mapped with bwa, how should I remove PCR duplicates? In my opinion, I should mark duplicates with Picard and then remove them with samtools.

I want to do this as a data preprocessing step.
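A note on diagnosing this: the assertion in the traceback fires when the read names of the two mates at the same position differ. The following is only a minimal sketch of that idea, not HiCExplorer's code. It compares two lists of read names position by position and reports the first disagreement. The function name `first_order_mismatch` and the toy read names are hypothetical; with real data the names could be taken from the BAM files, for example through pysam's `AlignmentFile` and each record's `query_name`. Real bwa mem output can contain several records per read (supplementary alignments), which this toy comparison ignores.

```python
from itertools import zip_longest


def first_order_mismatch(names_r1, names_r2):
    """Return (index, name_r1, name_r2) for the first position where the
    two read-name sequences disagree, or None if they are identical.

    A missing read at the end of one file shows up as None on that side.
    """
    for index, (name1, name2) in enumerate(zip_longest(names_r1, names_r2)):
        if name1 != name2:
            return index, name1, name2
    return None


# Toy example (hypothetical read names): "read2" was filtered out of the
# second mate file only, so every later read is shifted by one position.
mates_r1 = ["read1", "read2", "read3"]
mates_r2 = ["read1", "read3"]

print(first_order_mismatch(mates_r1, mates_r2))
```

If the two files were filtered independently, the first reported mismatch usually points at a read that was removed from one file but not the other.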

bgruening commented 5 years ago

@abhisheksinghnl if you remove PCR duplicates, make sure you remove the entire pair, that is, both reads, one from each of the two FASTQ files. Both BAM files need to be sorted identically so that the pairs can be processed in order.
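To illustrate the point about removing whole pairs, here is a small sketch, under the assumption that each mate file can be represented as a list of `(read_name, mapq)` tuples in the same order. The function `filter_pairs` and the toy records are hypothetical and only show the principle: a pair is kept or dropped as a unit, so the two outputs stay in step.

```python
def filter_pairs(mates1, mates2, min_mapq=5):
    """Keep a pair only if BOTH mates reach min_mapq.

    mates1 and mates2 are lists of (read_name, mapq) tuples in the same
    read order. Dropping both mates together keeps the outputs in sync.
    """
    kept1, kept2 = [], []
    for mate1, mate2 in zip(mates1, mates2):
        if mate1[0] != mate2[0]:
            # Inputs are already out of order; filtering cannot repair that.
            raise ValueError(
                "read order differs: %s vs %s" % (mate1[0], mate2[0])
            )
        if mate1[1] >= min_mapq and mate2[1] >= min_mapq:
            kept1.append(mate1)
            kept2.append(mate2)
    return kept1, kept2


# Toy records (hypothetical): only the first mate of "read2" has low MAPQ,
# but the whole pair is dropped from both outputs.
r1 = [("read1", 60), ("read2", 3), ("read3", 40)]
r2 = [("read1", 55), ("read2", 60), ("read3", 42)]

out1, out2 = filter_pairs(r1, r2, min_mapq=5)
print([name for name, _ in out1])
print([name for name, _ in out2])
```

Filtering each file on its own would have removed "read2" from the first file only, which is the situation that leads to mismatched read names at the same position.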

joachimwolff commented 5 years ago

hicBuildMatrix takes care of PCR duplicates and low-quality reads. With the --minMappingQuality parameter you can define the quality threshold.

gtrichard commented 5 years ago

Your mapping and Hi-C matrix building should look like this (taken from Snakepipes):

https://github.com/maxplanck-ie/snakepipes/tree/master

findRestSite -f genome.fasta --searchPattern GATC -o DpnII.bed
bwa mem -A1 -B4 -E50 -L0 -t 15 genome_bwa_index FASTQs/sample1_R1.fastq | samtools view -Shb - > BAMs/sample1_R1.bam
bwa mem -A1 -B4 -E50 -L0 -t 15 genome_bwa_index FASTQs/sample1_R2.fastq | samtools view -Shb - > BAMs/sample1_R2.bam
hicBuildMatrix -s BAMs/sample1_R1.bam BAMs/sample1_R2.bam -rs DpnII.bed --restrictionSequence GATC --danglingSequence GATC --minDistance 150 --maxDistance 1000 --QCfolder HiC_matrices/QCs --threads 10 -o HiC_matrices/sample1.h5

This should create a restriction fragment resolution matrix. As @joachimwolff said you can use --minMappingQuality in hicBuildMatrix to filter reads.
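For readers wondering what the restriction site file in the first command contains, the sketch below shows the general idea of locating a cut pattern such as GATC in a sequence and reporting BED-style intervals (0-based start, exclusive end). This is an illustration only, not the implementation of findRestSite, and the real tool's output columns may differ. The function name `find_rest_sites` and the toy sequence are hypothetical.

```python
import re


def find_rest_sites(chrom, sequence, pattern="GATC"):
    """Return BED-style (chrom, start, end) tuples for every occurrence
    of pattern in sequence, using 0-based, half-open coordinates."""
    sites = []
    # A lookahead matches at every start position, so overlapping
    # occurrences of the pattern are reported as well.
    for match in re.finditer("(?=%s)" % re.escape(pattern), sequence.upper()):
        start = match.start()
        sites.append((chrom, start, start + len(pattern)))
    return sites


# Toy sequence (hypothetical) containing three GATC sites.
for site in find_rest_sites("chr1", "AAGATCTTGATCGATC"):
    print("\t".join(str(field) for field in site))
```

Consecutive sites define the restriction fragments, which is why a matrix built from such a file has one bin per fragment.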

LeilyR commented 5 years ago

Following up on what @gtrichard mentioned, you could alternatively use the HiC pipeline from snakepipes.

abhisheksinghnl commented 5 years ago

Here is the output of the previous command:

$ conda create -c bioconda --name hicexplorer hicexplorer
WARNING: A conda environment already exists at '/tools/eb/software/Miniconda3/4.4.10/envs/hicexplorer'
Remove existing environment (y/[n])? y

Collecting package metadata: done
Solving environment: done

## Package Plan ##

  environment location: /tools/eb/software/Miniconda3/4.4.10/envs/hicexplorer

  added / updated specs:
    - hicexplorer

The following NEW packages will be INSTALLED:

  backports          pkgs/main/linux-64::backports-1.0-py27_1
  backports.functoo~ pkgs/main/linux-64::backports.functools_lru_cache-1.5-py27_1
  backports_abc      pkgs/main/linux-64::backports_abc-0.5-py27_0
  bcftools           bioconda/linux-64::bcftools-1.9-h47928c2_1
  biopython          bioconda/linux-64::biopython-1.68-py27_0
  blas               pkgs/main/linux-64::blas-1.0-mkl
  blosc              pkgs/main/linux-64::blosc-1.15.0-hd408876_0
  bx-python          bioconda/linux-64::bx-python-0.7.3-py27_0
  bzip2              pkgs/main/linux-64::bzip2-1.0.6-h14c3975_5
  ca-certificates    pkgs/main/linux-64::ca-certificates-2019.1.23-0
  certifi            pkgs/main/linux-64::certifi-2018.11.29-py27_0
  curl               pkgs/main/linux-64::curl-7.63.0-hbc83047_1000
  cycler             pkgs/main/linux-64::cycler-0.10.0-py27_0
  dbus               pkgs/main/linux-64::dbus-1.13.6-h746ee38_0
  expat              pkgs/main/linux-64::expat-2.2.6-he6710b0_0
  fontconfig         pkgs/main/linux-64::fontconfig-2.13.0-h9420a91_0
  freetype           pkgs/main/linux-64::freetype-2.9.1-h8a8886c_1
  functools32        pkgs/main/linux-64::functools32-3.2.3.2-py27_1
  futures            pkgs/main/linux-64::futures-3.2.0-py27_0
  glib               pkgs/main/linux-64::glib-2.56.2-hd408876_0
  gst-plugins-base   pkgs/main/linux-64::gst-plugins-base-1.14.0-hbbd80ab_1
  gstreamer          pkgs/main/linux-64::gstreamer-1.14.0-hb453b48_1
  hdf5               pkgs/main/linux-64::hdf5-1.10.4-hb1b8bf9_0
  hicexplorer        bioconda/linux-64::hicexplorer-1.3-py27_0
  htslib             bioconda/linux-64::htslib-1.9-h47928c2_5
  icu                pkgs/main/linux-64::icu-58.2-h9c2bf20_1
  intel-openmp       pkgs/main/linux-64::intel-openmp-2019.1-144
  jpeg               pkgs/main/linux-64::jpeg-9b-h024ee3a_2
  kiwisolver         pkgs/main/linux-64::kiwisolver-1.0.1-py27hf484d3e_0
  krb5               pkgs/main/linux-64::krb5-1.16.1-h173b8e3_7
  libcurl            pkgs/main/linux-64::libcurl-7.63.0-h20c2e04_1000
  libdeflate         bioconda/linux-64::libdeflate-1.0-h14c3975_1
  libedit            pkgs/main/linux-64::libedit-3.1.20181209-hc058e9b_0
  libffi             pkgs/main/linux-64::libffi-3.2.1-hd88cf55_4
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-8.2.0-hdf63c60_1
  libgfortran-ng     pkgs/main/linux-64::libgfortran-ng-7.3.0-hdf63c60_0
  libpng             pkgs/main/linux-64::libpng-1.6.36-hbc83047_0
  libssh2            pkgs/main/linux-64::libssh2-1.8.0-h1ba5d50_4
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-8.2.0-hdf63c60_1
  libtiff            pkgs/main/linux-64::libtiff-4.0.10-h2733197_2
  libuuid            pkgs/main/linux-64::libuuid-1.0.3-h1bed415_2
  libxcb             pkgs/main/linux-64::libxcb-1.13-h1bed415_1
  libxml2            pkgs/main/linux-64::libxml2-2.9.9-he19cac6_0
  lzo                pkgs/main/linux-64::lzo-2.10-h49e0be7_2
  matplotlib         pkgs/main/linux-64::matplotlib-2.2.3-py27hb69df0a_0
  mkl                pkgs/main/linux-64::mkl-2019.1-144
  mkl_fft            pkgs/main/linux-64::mkl_fft-1.0.10-py27ha843d7b_0
  mkl_random         pkgs/main/linux-64::mkl_random-1.0.2-py27hd81dba3_0
  mmtf-python        bioconda/linux-64::mmtf-python-1.0.2-py27_0
  msgpack-python     pkgs/main/linux-64::msgpack-python-0.6.1-py27hfd86e86_1
  ncurses            pkgs/main/linux-64::ncurses-6.1-he6710b0_1
  numexpr            pkgs/main/linux-64::numexpr-2.6.9-py27h9e4a6bb_0
  numpy              pkgs/main/linux-64::numpy-1.15.4-py27h7e9f1db_0
  numpy-base         pkgs/main/linux-64::numpy-base-1.15.4-py27hde5b4d6_0
  olefile            pkgs/main/linux-64::olefile-0.46-py27_0
  openssl            pkgs/main/linux-64::openssl-1.1.1a-h7b6447c_0
  pcre               pkgs/main/linux-64::pcre-8.42-h439df22_0
  pillow             pkgs/main/linux-64::pillow-5.4.1-py27h34e0f95_0
  pip                pkgs/main/linux-64::pip-19.0.1-py27_0
  pybigwig           bioconda/linux-64::pybigwig-0.3.13-py27hdfb72b2_0
  pyparsing          pkgs/main/linux-64::pyparsing-2.3.1-py27_0
  pyqt               pkgs/main/linux-64::pyqt-5.9.2-py27h05f1152_2
  pysam              bioconda/linux-64::pysam-0.15.2-py27h1671916_1
  pytables           pkgs/main/linux-64::pytables-3.4.4-py27h71ec239_0
  python             pkgs/main/linux-64::python-2.7.15-h9bab390_6
  python-dateutil    pkgs/main/linux-64::python-dateutil-2.7.5-py27_0
  pytz               pkgs/main/linux-64::pytz-2018.9-py27_0
  qt                 pkgs/main/linux-64::qt-5.9.7-h5867ecd_1
  readline           pkgs/main/linux-64::readline-7.0-h7b6447c_5
  reportlab          pkgs/main/linux-64::reportlab-3.5.13-py27he686d34_0
  samtools           bioconda/linux-64::samtools-1.9-h91753b0_5
  scipy              pkgs/main/linux-64::scipy-1.2.0-py27h7c811a0_0
  setuptools         pkgs/main/linux-64::setuptools-40.8.0-py27_0
  singledispatch     pkgs/main/linux-64::singledispatch-3.4.0.3-py27_0
  sip                pkgs/main/linux-64::sip-4.19.8-py27hf484d3e_0
  six                pkgs/main/linux-64::six-1.12.0-py27_0
  snappy             pkgs/main/linux-64::snappy-1.1.7-hbae5bb6_3
  sqlite             pkgs/main/linux-64::sqlite-3.26.0-h7b6447c_0
  subprocess32       pkgs/main/linux-64::subprocess32-3.5.3-py27h7b6447c_0
  tk                 pkgs/main/linux-64::tk-8.6.8-hbc83047_0
  tornado            pkgs/main/linux-64::tornado-5.1.1-py27h7b6447c_0
  wheel              pkgs/main/linux-64::wheel-0.32.3-py27_0
  xz                 pkgs/main/linux-64::xz-5.2.4-h14c3975_4
  zlib               pkgs/main/linux-64::zlib-1.2.11-h7b6447c_3
  zstd               pkgs/main/linux-64::zstd-1.3.7-h0b5b093_0

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate hicexplorer
#
# To deactivate an active environment, use
#
#     $ conda deactivate