open2c / pairtools

Extract 3D contacts (.pairs) from sequencing alignments
MIT License

pairtools sort issue #244

Closed ryvan1021 closed 2 months ago

ryvan1021 commented 4 months ago

Hello,

I'm running pairtools v1.1.0, following the pipeline from the Omni-C documentation, and I keep running into this error:

pairtools sort --tmpdir=/path/to/temp --nproc 36 parsed.pairsam > sorted.pairsam
Traceback (most recent call last):
  File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/bin/pairtools", line 11, in <module>
    sys.exit(cli())
             ^^^^^
  File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/pairtools/cli/__init__.py", line 183, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/pairtools/cli/sort.py", line 128, in sort
    sort_py(
  File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/pairtools/cli/sort.py", line 220, in sort_py
    for line in body_stream:
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfe in position 1903: invalid start byte

I tried using iconv but it doesn't look like that worked. Any suggestions?

golobor commented 4 months ago

Hi, since pairtools sort is regularly tested, I would first assume that something is wrong with the input. The error looks very specific ("...byte 0xfe in position 1903"). Could it be that you're dealing with a corrupted file? Have you tried opening it with less and checking whether it has reasonable content and terminates properly?
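
If paging through the file with less is impractical, the same check can be scripted. Below is a minimal sketch, not part of pairtools: the helper name find_invalid_utf8 and its max_reports parameter are made up for illustration. It reads the file as raw bytes and reports which lines fail to decode as UTF-8. Note that the "position 1903" in the traceback is probably relative to the decoder's current read buffer rather than the start of the file, so the offsets reported here may not match it.

```python
def find_invalid_utf8(path, max_reports=10):
    """Scan a file as raw bytes and list the lines that are not valid UTF-8.

    Returns a list of (line_number, byte_offset_in_file, offending_byte)
    tuples, stopping after max_reports problems. Hypothetical helper,
    not a pairtools function.
    """
    problems = []
    offset = 0  # byte offset of the start of the current line
    with open(path, "rb") as fh:
        for line_no, raw in enumerate(fh, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as err:
                # err.start is the index of the first bad byte within this line
                problems.append((line_no, offset + err.start, raw[err.start]))
                if len(problems) >= max_reports:
                    break
            offset += len(raw)
    return problems


# Example usage (the file name is taken from the command in the report):
# for line_no, pos, byte in find_invalid_utf8("parsed.pairsam"):
#     print(f"line {line_no}, file offset {pos}: byte 0x{byte:02x}")
```

An empty result would suggest the file itself is valid UTF-8 and the problem lies elsewhere. A handful of hits points at specific lines worth inspecting, while hits on nearly every line would suggest the file is binary or compressed rather than plain text.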

golobor commented 4 months ago

Alternatively, please check this thread: https://stackoverflow.com/questions/19699367/for-line-in-results-in-unicodedecodeerror-utf-8-codec-cant-decode-byte

We test pairtools only on Linux and only on .pairs[am] files produced by pairtools parse, in which case utf-8 normally works. Could it be that the input file was generated either on another OS or not by pairtools parse?

golobor commented 2 months ago

ok, closing for now - feel free to reopen if more information comes up