arq5x / bedtools2

bedtools - the swiss army knife for genome arithmetic
MIT License

Switching the order of the -a and -b parameters in bedtools intersect when the -b file is a BAM file outputs the BAM header. #497

Closed: ghuls closed this issue 5 months ago

ghuls commented 7 years ago
$ cd test/intersect

$ bedtools intersect -a j1.bed -b oneUnmapped.bam
chr1    98650   98704   a1  100 +

$ bedtools intersect -b oneUnmapped.bam -a j1.bed
chr1    98650   98704   a1  100 +
[binary BAM data written to the terminal; remainder of output omitted]
ghuls commented 7 years ago

It might be a more general problem with the parsing order of the arguments.

Example 1 should give the same result as example 4. Example 2 should give the same result as example 3.

$ printf 'chr1\t123456\n'
chr1    123456

$ printf 'chr1\t123456\t123457\n'
chr1    123456  123457

# Example 1
$ bin/bedtools intersect -a <(printf 'chr1\t123456\n') -b <(printf 'chr1\t123456\t123457\n')
ERROR: file /dev/fd/63 has non positional records, which are only valid for the groupBy tool.

Tool:    bedtools intersect (aka intersectBed)
Version: v2.26.0-50-gccdc919-dirty
Summary: Report overlaps between two feature files.

Usage:   bedtools intersect [OPTIONS] -a <bed/gff/vcf/bam> -b <bed/gff/vcf/bam>
...

# Example 2
$ bin/bedtools intersect -b <(printf 'chr1\t123456\n') -a <(printf 'chr1\t123456\t123457\n')
ERROR: file /dev/fd/63 has non positional records, which are only valid for the groupBy tool.

Tool:    bedtools intersect (aka intersectBed)
Version: v2.26.0-50-gccdc919-dirty
Summary: Report overlaps between two feature files.

Usage:   bedtools intersect [OPTIONS] -a <bed/gff/vcf/bam> -b <bed/gff/vcf/bam>
...

# Example 3
$ bin/bedtools intersect -a <(printf 'chr1\t123456\t123457\n') -b <(printf 'chr1\t123456\n')
ERROR: Received illegal bin number 4294967295 from getBin call.
ERROR: Unable to add record to tree.

# Example 4
$ bin/bedtools intersect -b <(printf 'chr1\t123456\t123457\n') -a <(printf 'chr1\t123456\n')
Segmentation fault (core dumped)
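
All four examples feed bedtools a record with only two columns, which is not a complete interval. A minimal pre-check sketch, assuming awk is available and the file has no header or comment lines; the function name check_bed_columns is hypothetical and not part of bedtools:

```shell
# Hypothetical helper, not part of bedtools.
# Reports the first line with fewer than three tab-separated fields
# (chrom, start, end) and exits non-zero; exits 0 if every line has >= 3.
check_bed_columns() {
    awk -F '\t' '
        NF < 3 {
            printf "line %d has %d field(s)\n", NR, NF
            bad = 1
            exit            # jump to END on the first bad line
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Running check_bed_columns on a file before passing it to bedtools intersect would flag the two-column record up front instead of relying on the tool's own error handling.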
ghuls commented 7 years ago

After the patch for the first issue, the output of example 4 changed.

# Example 4
$ bin/bedtools intersect -b <(printf 'chr1\t123456\t123457\n') -a <(printf 'chr1\t123456\n')
***** WARNING: File  has inconsistent naming convention for record:
chr1    123456

***** WARNING: File  has inconsistent naming convention for record:
chr1    123456
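
To check mechanically that swapping the argument order does not change the result, something like the following could be used. This is only a sketch: same_output is a made-up helper name, and it compares stdout only, ignoring exit codes and stderr.

```shell
# Hypothetical helper, not part of bedtools.
# Runs two command lines and succeeds only if their stdout is byte-identical.
# Exit codes and stderr of the two commands are deliberately ignored.
same_output() {
    out1=$(mktemp)
    out2=$(mktemp)
    sh -c "$1" > "$out1"
    sh -c "$2" > "$out2"
    if cmp -s "$out1" "$out2"; then
        status=0
    else
        status=1
    fi
    rm -f "$out1" "$out2"
    return "$status"
}
```

For the original report this would be called as same_output 'bedtools intersect -a j1.bed -b oneUnmapped.bam' 'bedtools intersect -b oneUnmapped.bam -a j1.bed' from the test/intersect directory.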
M-gowda commented 5 years ago

I encountered the same error and found that my BED files had been prepared with Windows-style line endings. Lesson learnt: make sure BED files use Linux-style line endings (EOL). Solution: open the file in Notepad++. How to check: View > Show Symbol > Show End of Line; CRLF means Windows and LF means Linux. How to convert: Edit > EOL Conversion > Unix (LF).
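
For anyone without Notepad++, the same check and conversion can be sketched in a POSIX shell with grep and tr. The function names has_crlf and strip_cr are made up for illustration and are not part of bedtools:

```shell
# Hypothetical helpers, not part of bedtools.

# Succeeds (exit 0) if at least one line in the file ends with a
# carriage return, i.e. the file has Windows-style CRLF line endings.
has_crlf() {
    cr=$(printf '\r')
    grep -q "${cr}\$" "$1"
}

# Writes a copy of the input file ($1) to the output file ($2) with
# every carriage return removed, leaving plain LF line endings.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}
```

A typical use would be has_crlf input.bed && strip_cr input.bed fixed.bed, where both file names are placeholders. Note that strip_cr removes every carriage return in the file, not only those at line ends, which should be harmless for tab-separated BED records.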

M-gowda commented 5 years ago

The error was: ERROR: file sorted_YH3K9me3.bed has non positional records, which are only valid for the groupBy tool.

dbolser commented 6 months ago

Also see dos2unix on the command line.

I think this can be closed?