pontussk / PMDtools

Compute postmortem damage patterns and decontaminate ancient genomes
GNU General Public License v3.0

Python3 Incompatibilities #6

Open apeltzer opened 5 years ago

apeltzer commented 5 years ago

Apparently there are quite a few issues with Python 3, which make reading input files difficult.

a.) UTF-8/Latin-1 file encodings are difficult to handle in Python 3, especially when trying to reproduce consistently what Python 2 did automatically when reading input.

See for details:

https://stackoverflow.com/questions/12468179/unicodedecodeerror-utf8-codec-cant-decode-byte-0x9c

b.) I guess the easiest way would be to add pysam (which is installable via pip, conda, etc.) and then rely on it as the reading/writing library for SAM/BAM compatibility. One could even have automatic MD tagging activated, making the process easier for users too.
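To illustrate point a.), here is a minimal sketch, assuming the script reads SAM text from a byte stream such as standard input. Python 2 treated input as raw bytes, while Python 3 decodes it as text and raises `UnicodeDecodeError` on bytes that are invalid in the chosen encoding. Latin-1 maps every byte value to a code point, so decoding with it cannot fail. The helper name `iter_text_lines` is hypothetical and not part of PMDtools.

```python
import io


def iter_text_lines(binary_stream, encoding="latin-1"):
    """Yield decoded lines (without the trailing newline) from a binary stream.

    With the default Latin-1 encoding every byte decodes to exactly one
    character, so no UnicodeDecodeError can occur, much like Python 2's
    byte strings. (Hypothetical helper, not taken from PMDtools.)
    """
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline="\n")
    for line in text_stream:
        yield line.rstrip("\n")
```

In a script this could be called as `iter_text_lines(sys.stdin.buffer)`, since `sys.stdin.buffer` is the raw byte stream behind `sys.stdin` in Python 3.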

pontussk commented 5 years ago

Hi Alex,

This would be great but one of the goals I have had for PMDtools has been to have minimal dependencies (only samtools in principle). Is there any possibility other than pysam for this?

Best, Pontus
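One dependency-free possibility, sketched here under assumptions rather than taken from PMDtools: keep reading the text SAM stream produced by `samtools view` and parse each line with the standard library only. The sketch relies on the SAM layout of 11 mandatory tab-separated fields followed by optional `TAG:TYPE:VALUE` fields. The function name `parse_sam_line` and the returned dictionary keys are hypothetical.

```python
def parse_sam_line(line):
    """Split one SAM alignment line into core fields and optional tags.

    Returns None for header lines, which start with '@'.
    (Hypothetical helper; the dictionary layout is an assumption.)
    """
    if line.startswith("@"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 tab-separated SAM fields")
    tags = {}
    for item in fields[11:]:
        # Optional fields look like TAG:TYPE:VALUE; VALUE may itself contain ':'.
        tag, _type, value = item.split(":", 2)
        tags[tag] = value
    return {
        "qname": fields[0],
        "flag": int(fields[1]),
        "rname": fields[2],
        "pos": int(fields[3]),
        "mapq": int(fields[4]),
        "cigar": fields[5],
        "seq": fields[9],
        "qual": fields[10],
        "tags": tags,
    }
```

The MD tag, when present in the input (for example after running `samtools calmd`), is then available as `record["tags"].get("MD")`.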

apeltzer commented 5 years ago

Hi Pontus,

Yes, we could work on fixing these issues in the code directly. @boulund apparently ported the code base to Python 3 without changing that, so we could ask him whether he can share what he changed :-)

cf https://github.com/nf-core/eager/issues/36

boulund commented 5 years ago

I haven't had time to look at this in detail, maybe my code has the same issue with encodings as you've encountered?

Please have a look at the start of the rewrite I made here: https://github.com/boulund/PMDtools/tree/boulund-rewrite

Note that I made my rewrite of this a while back, before the code was put online under a permissive license, so if there are recent changes they are probably not incorporated in my version of the code.

pontussk commented 5 years ago

Thanks Fredrik, that is interesting.

rsh249 commented 3 years ago

Was there a solution to the python3 incompatibility?

apeltzer commented 3 years ago

None that I'm aware of so far, unfortunately. I've not had time to take a more detailed look, but would be happy to see such an update :-)

boulund commented 3 years ago

I haven't compared the results from my Python 3 conversion with the original code, but it runs in Python 3 and produces reasonable-looking output.

rsh249 commented 3 years ago

I made an attempt to make the original script work with python3 and also get sensible results (so far).

https://github.com/rsh249/PMDtools (cloned repository with fixes for python3)

I’ll check out yours @boulund too!

apeltzer commented 3 years ago

If this works reliably, it could be a good idea to open a PR to this repository and get it included in a new release, maybe?

rsh249 commented 3 years ago

I will keep testing and let you know if my version seems OK for a PR. Thanks all!

pontussk commented 3 years ago

Thanks all for looking at porting this to python3. Let me know what I can do.

Pontus

boulund commented 2 years ago

Is there still interest in a Python 3 port of PMDtools? If there are any significant updates to the original code of version 0.6 (which is labelled 0.5 in the code), I could add them to my py3 port and create a PR to this repo, if anyone still wants it. Maybe PMDtools is good enough as is, or people have moved on to something else? Is there a good test data set that could be used to verify that the output doesn't change?
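One way such a check might be scripted, assuming the output of the original script and of the port on the same input has already been captured as text: compare the two with the standard library's `difflib`. The function name `diff_outputs` and the labels `python2_output` / `python3_output` are illustrative only.

```python
import difflib
import itertools


def diff_outputs(reference_text, candidate_text, max_lines=20):
    """Return up to max_lines unified-diff lines between two outputs.

    An empty list means the two texts are identical line by line.
    (Hypothetical helper for a regression check, not part of PMDtools.)
    """
    diff = difflib.unified_diff(
        reference_text.splitlines(),
        candidate_text.splitlines(),
        fromfile="python2_output",
        tofile="python3_output",
        lineterm="",
    )
    return list(itertools.islice(diff, max_lines))
```

A test could then simply assert that `diff_outputs(old_output, new_output)` is empty for each file in the test data set.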

pontussk commented 1 year ago

Hi Fredrik,

I would certainly be interested in this. Thanks!

Best, Pontus
