Closed by abhisheksinghnl 5 years ago
Hi,
As described in our documentation, you have to use the --reorder option for hisat and bowtie, but BWA does not require this. I don't know what Picard tools does when reordering, and I cannot give you support for that tool.
Best,
Joachim
Hi,
I just relaunched the job without any filters (no removal of PCR duplicates, no MAPQ filtering) and it runs fine. I guess the culprit is Picard or some settings.
BTW if I have mapped using bwa, how should I remove PCR duplicates? In my opinion, I should mark duplicates using Picard and then remove them using samtools.
I want to do this as a data preprocessing step.
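For context, the mark-then-remove approach relies on the duplicate bit of the SAM FLAG field (0x400, decimal 1024), which duplicate-marking tools set on each read they consider a duplicate. A minimal Python sketch of that flag test follows; the helper name and the flag values are made up for illustration and are not taken from the real BAM files.

```python
# SAM FLAG bit meaning "PCR or optical duplicate" (0x400 == 1024).
DUPLICATE_FLAG = 0x400

def is_duplicate(flag):
    """Return True if the SAM FLAG value has the duplicate bit set."""
    return bool(flag & DUPLICATE_FLAG)

# Illustrative FLAG values only: 1040 is 1024 (duplicate) + 16 (reverse strand).
flags = [0, 16, 1024, 1040]
kept = [f for f in flags if not is_duplicate(f)]
print(kept)
```

Removing reads this way from each mate file independently can leave the two files with different read sets, which matters for the paired input that hicBuildMatrix expects.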
@abhisheksinghnl if you remove PCR duplicates, make sure you remove the entire pair, that is, both mates from both FASTQ files. The read order of both BAM files needs to be the same so that the pairs can be processed in order.
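A rough Python sketch of what "remove the entire pair" means in practice: after filtering, keep only reads whose name still occurs in both mate files, without changing their order. The function name and the toy records are hypothetical, and the sketch assumes each read name appears once per file.

```python
def keep_complete_pairs(reads1, reads2):
    """Keep only reads whose name survives in both files, preserving order.

    reads1 and reads2 are lists of (read_name, record) tuples, one list per
    mate file. Hypothetical helper, not part of HiCExplorer.
    """
    names1 = {name for name, _ in reads1}
    names2 = {name for name, _ in reads2}
    shared = names1 & names2
    kept1 = [r for r in reads1 if r[0] in shared]
    kept2 = [r for r in reads2 if r[0] in shared]
    return kept1, kept2

# Toy input: read "b" lost its mate in file 2, read "d" lost its mate in file 1.
r1 = [("a", "R1-a"), ("b", "R1-b"), ("c", "R1-c")]
r2 = [("a", "R2-a"), ("c", "R2-c"), ("d", "R2-d")]
k1, k2 = keep_complete_pairs(r1, r2)
print([n for n, _ in k1], [n for n, _ in k2])
```

If the two inputs start in the same read order, the two outputs line up position by position again, because filtering on the shared names never reorders anything.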
hicBuildMatrix takes care of PCR duplicates and low quality reads. With the parameter --minMappingQuality you can define the quality level.
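As a rough illustration of pair-level quality filtering, here is a Python sketch that keeps a pair only when both mates reach the threshold. That rule is an assumption about how such a filter typically works, not a statement of the exact hicBuildMatrix logic, and the function name and MAPQ values are made up.

```python
def pair_passes(mapq1, mapq2, min_mapping_quality):
    """Return True only if both mates reach the mapping quality threshold."""
    return mapq1 >= min_mapping_quality and mapq2 >= min_mapping_quality

# Toy (mate1 MAPQ, mate2 MAPQ) values, filtered with a threshold of 5.
pairs = [(60, 60), (60, 3), (4, 30), (5, 5)]
kept_pairs = [p for p in pairs if pair_passes(p[0], p[1], 5)]
print(kept_pairs)
```

Filtering at the pair level like this keeps the two mate files in sync, which per-file pre-filtering does not guarantee.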
Your mapping and hicMatrix building should look like this (taken from Snakepipes):
https://github.com/maxplanck-ie/snakepipes/tree/master
findRestSite -f genome.fasta --searchPattern GATC -o DpnII.bed
bwa mem -A1 -B4 -E50 -L0 -t 15 genome_bwa_index FASTQs/sample1_R1.fastq | samtools view -Shb - > BAMs/sample1_R1.bam
bwa mem -A1 -B4 -E50 -L0 -t 15 genome_bwa_index FASTQs/sample1_R2.fastq | samtools view -Shb - > BAMs/sample1_R2.bam
hicBuildMatrix -s BAMs/sample1_R1.bam BAMs/sample1_R2.bam -rs DpnII.bed --restrictionSequence GATC --danglingSequence GATC --minDistance 150 --maxDistance 1000 --QCfolder HiC_matrices/QCs --threads 10 -o HiC_matrices/sample1.h5
This should create a restriction fragment resolution matrix. As @joachimwolff said, you can use --minMappingQuality in hicBuildMatrix to filter reads.
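To make the findRestSite step in the commands above more concrete, here is a small Python sketch of the underlying idea: scan a sequence for the restriction pattern and report each match as a BED-like interval. This is not the tool's implementation; the chromosome name, the toy sequence and the function name are invented, and the real output may carry additional columns.

```python
import re

def find_rest_sites(chrom, sequence, pattern="GATC"):
    """Return (chrom, start, end) tuples, 0-based half-open, for each match."""
    return [(chrom, m.start(), m.end()) for m in re.finditer(pattern, sequence)]

# Toy sequence containing three GATC sites.
sites = find_rest_sites("chrToy", "GATCAAGATCGGATC")
for site in sites:
    print("\t".join(str(field) for field in site))
```

The intervals between consecutive sites are the restriction fragments that define the bins of a restriction fragment resolution matrix.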
Following up on what @gtrichard mentioned, you could alternatively use the HiC pipeline from snakepipes.
Here is the output of previous command:
$ conda create -c bioconda --name hicexplorer hicexplorer
WARNING: A conda environment already exists at '/tools/eb/software/Miniconda3/4.4.10/envs/hicexplorer'
Remove existing environment (y/[n])? y
Collecting package metadata: done
Solving environment: done
## Package Plan ##
environment location: /tools/eb/software/Miniconda3/4.4.10/envs/hicexplorer
added / updated specs:
- hicexplorer
The following NEW packages will be INSTALLED:
backports pkgs/main/linux-64::backports-1.0-py27_1
backports.functoo~ pkgs/main/linux-64::backports.functools_lru_cache-1.5-py27_1
backports_abc pkgs/main/linux-64::backports_abc-0.5-py27_0
bcftools bioconda/linux-64::bcftools-1.9-h47928c2_1
biopython bioconda/linux-64::biopython-1.68-py27_0
blas pkgs/main/linux-64::blas-1.0-mkl
blosc pkgs/main/linux-64::blosc-1.15.0-hd408876_0
bx-python bioconda/linux-64::bx-python-0.7.3-py27_0
bzip2 pkgs/main/linux-64::bzip2-1.0.6-h14c3975_5
ca-certificates pkgs/main/linux-64::ca-certificates-2019.1.23-0
certifi pkgs/main/linux-64::certifi-2018.11.29-py27_0
curl pkgs/main/linux-64::curl-7.63.0-hbc83047_1000
cycler pkgs/main/linux-64::cycler-0.10.0-py27_0
dbus pkgs/main/linux-64::dbus-1.13.6-h746ee38_0
expat pkgs/main/linux-64::expat-2.2.6-he6710b0_0
fontconfig pkgs/main/linux-64::fontconfig-2.13.0-h9420a91_0
freetype pkgs/main/linux-64::freetype-2.9.1-h8a8886c_1
functools32 pkgs/main/linux-64::functools32-3.2.3.2-py27_1
futures pkgs/main/linux-64::futures-3.2.0-py27_0
glib pkgs/main/linux-64::glib-2.56.2-hd408876_0
gst-plugins-base pkgs/main/linux-64::gst-plugins-base-1.14.0-hbbd80ab_1
gstreamer pkgs/main/linux-64::gstreamer-1.14.0-hb453b48_1
hdf5 pkgs/main/linux-64::hdf5-1.10.4-hb1b8bf9_0
hicexplorer bioconda/linux-64::hicexplorer-1.3-py27_0
htslib bioconda/linux-64::htslib-1.9-h47928c2_5
icu pkgs/main/linux-64::icu-58.2-h9c2bf20_1
intel-openmp pkgs/main/linux-64::intel-openmp-2019.1-144
jpeg pkgs/main/linux-64::jpeg-9b-h024ee3a_2
kiwisolver pkgs/main/linux-64::kiwisolver-1.0.1-py27hf484d3e_0
krb5 pkgs/main/linux-64::krb5-1.16.1-h173b8e3_7
libcurl pkgs/main/linux-64::libcurl-7.63.0-h20c2e04_1000
libdeflate bioconda/linux-64::libdeflate-1.0-h14c3975_1
libedit pkgs/main/linux-64::libedit-3.1.20181209-hc058e9b_0
libffi pkgs/main/linux-64::libffi-3.2.1-hd88cf55_4
libgcc-ng pkgs/main/linux-64::libgcc-ng-8.2.0-hdf63c60_1
libgfortran-ng pkgs/main/linux-64::libgfortran-ng-7.3.0-hdf63c60_0
libpng pkgs/main/linux-64::libpng-1.6.36-hbc83047_0
libssh2 pkgs/main/linux-64::libssh2-1.8.0-h1ba5d50_4
libstdcxx-ng pkgs/main/linux-64::libstdcxx-ng-8.2.0-hdf63c60_1
libtiff pkgs/main/linux-64::libtiff-4.0.10-h2733197_2
libuuid pkgs/main/linux-64::libuuid-1.0.3-h1bed415_2
libxcb pkgs/main/linux-64::libxcb-1.13-h1bed415_1
libxml2 pkgs/main/linux-64::libxml2-2.9.9-he19cac6_0
lzo pkgs/main/linux-64::lzo-2.10-h49e0be7_2
matplotlib pkgs/main/linux-64::matplotlib-2.2.3-py27hb69df0a_0
mkl pkgs/main/linux-64::mkl-2019.1-144
mkl_fft pkgs/main/linux-64::mkl_fft-1.0.10-py27ha843d7b_0
mkl_random pkgs/main/linux-64::mkl_random-1.0.2-py27hd81dba3_0
mmtf-python bioconda/linux-64::mmtf-python-1.0.2-py27_0
msgpack-python pkgs/main/linux-64::msgpack-python-0.6.1-py27hfd86e86_1
ncurses pkgs/main/linux-64::ncurses-6.1-he6710b0_1
numexpr pkgs/main/linux-64::numexpr-2.6.9-py27h9e4a6bb_0
numpy pkgs/main/linux-64::numpy-1.15.4-py27h7e9f1db_0
numpy-base pkgs/main/linux-64::numpy-base-1.15.4-py27hde5b4d6_0
olefile pkgs/main/linux-64::olefile-0.46-py27_0
openssl pkgs/main/linux-64::openssl-1.1.1a-h7b6447c_0
pcre pkgs/main/linux-64::pcre-8.42-h439df22_0
pillow pkgs/main/linux-64::pillow-5.4.1-py27h34e0f95_0
pip pkgs/main/linux-64::pip-19.0.1-py27_0
pybigwig bioconda/linux-64::pybigwig-0.3.13-py27hdfb72b2_0
pyparsing pkgs/main/linux-64::pyparsing-2.3.1-py27_0
pyqt pkgs/main/linux-64::pyqt-5.9.2-py27h05f1152_2
pysam bioconda/linux-64::pysam-0.15.2-py27h1671916_1
pytables pkgs/main/linux-64::pytables-3.4.4-py27h71ec239_0
python pkgs/main/linux-64::python-2.7.15-h9bab390_6
python-dateutil pkgs/main/linux-64::python-dateutil-2.7.5-py27_0
pytz pkgs/main/linux-64::pytz-2018.9-py27_0
qt pkgs/main/linux-64::qt-5.9.7-h5867ecd_1
readline pkgs/main/linux-64::readline-7.0-h7b6447c_5
reportlab pkgs/main/linux-64::reportlab-3.5.13-py27he686d34_0
samtools bioconda/linux-64::samtools-1.9-h91753b0_5
scipy pkgs/main/linux-64::scipy-1.2.0-py27h7c811a0_0
setuptools pkgs/main/linux-64::setuptools-40.8.0-py27_0
singledispatch pkgs/main/linux-64::singledispatch-3.4.0.3-py27_0
sip pkgs/main/linux-64::sip-4.19.8-py27hf484d3e_0
six pkgs/main/linux-64::six-1.12.0-py27_0
snappy pkgs/main/linux-64::snappy-1.1.7-hbae5bb6_3
sqlite pkgs/main/linux-64::sqlite-3.26.0-h7b6447c_0
subprocess32 pkgs/main/linux-64::subprocess32-3.5.3-py27h7b6447c_0
tk pkgs/main/linux-64::tk-8.6.8-hbc83047_0
tornado pkgs/main/linux-64::tornado-5.1.1-py27h7b6447c_0
wheel pkgs/main/linux-64::wheel-0.32.3-py27_0
xz pkgs/main/linux-64::xz-5.2.4-h14c3975_4
zlib pkgs/main/linux-64::zlib-1.2.11-h7b6447c_3
zstd pkgs/main/linux-64::zstd-1.3.7-h0b5b093_0
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate hicexplorer
#
# To deactivate an active environment, use
#
# $ conda deactivate
Hi,
I have aligned my reads using bwa and have reordered them as well using picard tools.
Also, I have removed PCR duplicates and reads with MAPQ <5 (are these causing the error?).
When I run hicBuildMatrix as follows:
hicBuildMatrix --samFiles Re.PD.Mapq.MD.Sort.Reorder_log1.bam Re.PD.Mapq.MD.Sort.Reorder_log2.bam --outBam HiC.T.R1.bam --restrictionCutFile rest_site_positions.bed -o hic_matrix.h5 --minMappingQuality 5
reading Re.PD.Mapq.MD.Sort.Reorder_log1.bam and Re.PD.Mapq.MD.Sort.Reorder_log2.bam to build hic_matrix
Minimum distance considered between restriction sites is 300
Max distance: 800
Matrix size: 11659518
Traceback (most recent call last):
File "/home/zzeeo/.conda/envs/hicexplorer/bin/hicBuildMatrix", line 4, in <module>
__import__('pkg_resources').run_script('HiCExplorer==1.3', 'hicBuildMatrix')
File "/home/zzeeo/.conda/envs/hicexplorer/lib/python2.7/site-packages/pkg_resources/__init__.py", line 664, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/home/zzeeo/.conda/envs/hicexplorer/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1444, in run_script
exec(code, namespace, namespace)
File "/home/zzeeo/.conda/envs/hicexplorer/lib/python2.7/site-packages/HiCExplorer-1.3-py2.7.egg-info/scripts/hicBuildMatrix", line 7, in <module>
main()
File "/home/zzeeo/.conda/envs/hicexplorer/lib/python2.7/site-packages/hicexplorer/hicBuildMatrix.py", line 644, in main
"the --reorder option".format(mate1.qname, mate2.qname)
AssertionError: FATAL ERROR A00387:60:GW181017000:1:1103:19877:20290 A00387:60:GW181017000:1:1216:26838:31375 Be sure that the sam files have the same read order If using Bowtie2 or Hisat2 add the --reorder option
I get the error above and I don't understand what the problem is.
Could you please shed some light.
Thank you.
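Judging from the message, the assertion appears to fire as soon as the read names at the same position in the two files differ. A small Python sketch of that kind of check, useful for finding the first out-of-sync position before running hicBuildMatrix; the function name and the toy read names are hypothetical, and in practice the names would be extracted from the two BAM files.

```python
from itertools import zip_longest

def first_order_mismatch(names1, names2):
    """Return (index, name1, name2) at the first position where the read
    names differ, or None if both lists agree in content and length."""
    for i, (n1, n2) in enumerate(zip_longest(names1, names2)):
        if n1 != n2:
            return i, n1, n2
    return None

# Toy read names: the second file has two reads swapped.
mate1 = ["read1", "read2", "read3"]
mate2 = ["read1", "read3", "read2"]
print(first_order_mismatch(mate1, mate2))
```

A mismatch can come from sorting the files by coordinate or from filtering each file on its own, since either step can change which read sits at a given position.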