usegalaxy-no / galaxyadmin

A repository for managing the work of the usegalaxy.no GalaxyAdmin team

Convert tool fails - tool error #31

Closed ehj000 closed 3 years ago

ehj000 commented 3 years ago

The Convert tool (several tools with the same name are installed, but it is this one: toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/gd_multiple_to_gd_genotype/1.0.0) fails with the following tool error: a bytes-like object is required, not 'str'

I am not able to resolve this, but wonder if it is caused by a Python 2 vs Python 3 difference, as similar issues are reported by others (not for this tool in Galaxy): "The reason for this error is that in Python 3, strings are Unicode, but when transmitting on the network, the data needs to be bytes instead"...

The failed job is this: Job API ID: 97037a4486d4e424 (9270)

kjellp commented 3 years ago

@torfinnnome is python3 the default in the "container" image, and is that what will be used if the tool dependencies do not specify a Python version?

torfinnnome commented 3 years ago

Yes, python3 is default:

usegalaxy.no $ singularity exec /srv/galaxy/containers/galaxy-python.sif python --version
Python 3.6.8

This tool package looks quite old (last updated 2015). To switch to python2 for this/these tools, the xml file has to be altered. Something like this might work:

https://toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/file/tip/multiple_to_gd_genotype.xml#l4

Replace <command interpreter="python"> with <command interpreter="python2">

I am not sure what the best strategy is for modifying these old tools. @kjetilkl, any suggestions? Updating the file(s) on the server (/srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/*xml) will work, but it is not exactly the optimal way of doing it. We need some sort of repository.
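
The interpreter swap suggested above could be scripted roughly as follows. This is a minimal sketch, not something that was run on the server: the function names `switch_interpreter` and `patch_tool_xml` are hypothetical, and it assumes the wrappers contain the literal string `<command interpreter="python">`. It keeps a `.orig` copy of each changed file so the edit stays easy to revert.

```python
import shutil
from pathlib import Path

# Literal strings taken from the suggestion above.
OLD = '<command interpreter="python">'
NEW = '<command interpreter="python2">'


def switch_interpreter(xml_text: str) -> str:
    """Return xml_text with the command interpreter switched to python2."""
    return xml_text.replace(OLD, NEW)


def patch_tool_xml(tool_dir: str) -> list:
    """Patch every *.xml file in tool_dir in place; return the names of changed files."""
    changed = []
    for xml_path in sorted(Path(tool_dir).glob("*.xml")):
        original = xml_path.read_text()
        patched = switch_interpreter(original)
        if patched != original:
            # Keep a backup next to the file so the edit can be reverted.
            shutil.copy2(xml_path, xml_path.with_name(xml_path.name + ".orig"))
            xml_path.write_text(patched)
            changed.append(xml_path.name)
    return changed
```

Plain string replacement is used instead of an XML parser so that the rest of each wrapper, including the Cheetah template inside `<command>`, is left byte-for-byte untouched.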

ehj000 commented 3 years ago

Thanks for looking into this. In parallel, I have asked the user if there are any equivalent tools she can use, but this problem might appear for other tools as well, so we should probably find a good solution.

torfinnnome commented 3 years ago

Perhaps we should use a default container image which defaults to python2? I'm not sure if this will break other tools, tho.

All "newer" tools should not use this default container image anyway, so it might be safe. I don't know. We can try it out on test.usegalaxy.no?

kjellp commented 3 years ago

It sounds reasonable to me to test this out on test.usegalaxy.no, if it's easy to revert after the test.

(Sent as an email reply to Torfinn Nome's comment of 07/01/2021 12.38.)

ehj000 commented 3 years ago

That would be great. I can test this once it is installed.

torfinnnome commented 3 years ago

The default Singularity container is now running python 2.7 by default (python 3 also available with python3 command).

ehj000 commented 3 years ago

I am not sure what this implies. Does this mean that the tool must be reinstalled in a particular way on test, or how do I specify that this tool should use python2?

torfinnnome commented 3 years ago

No need to do anything, apart from running the tool(s). Hopefully it will not fail now (I have not tested it yet).

ehj000 commented 3 years ago

Unfortunately the tool fails with the same error. I am quite certain that the input file is valid, since I am able to run other VCF manipulation tools using the same file.

torfinnnome commented 3 years ago

Hm, looking at the log files, it looks like it's using some python3 libraries. Strange. Will look into it. PS: https://test.usegalaxy.no/reports/ should work now. Same login as on usegalaxy.no/reports/.

torfinnnome commented 3 years ago

The base64 library behaves a bit differently in Python 3. This might fix the issue:

pwd: /srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/

--- multiple_to_gd_genotype.xml.orig    2021-01-12 11:57:21.101102148 +0100
+++ multiple_to_gd_genotype.xml 2021-01-12 13:23:32.522127720 +0100
@@ -3,7 +3,7 @@

   <command interpreter="python">
     #import base64
-    #set species_arg = base64.b64encode(str($species))
+    #set species_arg = base64.b64encode(str($species.encode())).decode()
     multiple_to_gd_genotype.py --input '$input' --output '$output' --dbkey '$dbkey' --species '$species_arg' --format '$input_format'
   </command>
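
For reference, the behaviour difference can be reproduced outside Galaxy. The sketch below is plain Python 3, not Cheetah template code, and the helper name `b64_text` and the sample value are made up for illustration. In Python 3, `base64.b64encode` accepts only bytes-like input, so a `str` has to be encoded first, and the result decoded again if a `str` is wanted on the command line.

```python
import base64


def b64_text(value: str) -> str:
    """Base64-encode a text value and return the result as text."""
    # str -> bytes, base64-encode, then bytes -> str.
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


species = "bighorn"  # made-up sample value

try:
    base64.b64encode(species)  # accepted by Python 2, rejected by Python 3
except TypeError as err:
    print("TypeError:", err)

print(b64_text(species))
```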

ehj000 commented 3 years ago

The user has provided me with a new VCF input file, but the tool still fails. The stack trace in the report finishes with this: NameError: name 'basestring' is not defined

I think it is still a Python version issue - see this: https://stackoverflow.com/questions/34803467/unexpected-exception-name-basestring-is-not-defined-when-invoking-ansible2

Should we continue digging or abandon the ship?

kjetilkl commented 3 years ago

I am no expert in Python, but from what I have read, "basestring" was apparently an abstract supertype of "str" and "unicode" in Python prior to version 3, but it was removed in Python 3.
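
A common way to cope with this in code that has to run under both versions is to define the name once and fall back to `str` when `basestring` is missing. This is a generic sketch, not code from the tool; `string_types` and `is_text` are hypothetical names.

```python
try:
    string_types = (basestring,)  # Python 2: covers both str and unicode
except NameError:
    string_types = (str,)  # Python 3: basestring no longer exists


def is_text(value):
    """Return True if value is a text string under either Python version."""
    return isinstance(value, string_types)
```

Code that previously called `isinstance(value, basestring)` would call `is_text(value)` instead and behave the same under either interpreter.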

torfinnnome commented 3 years ago

The full log:

Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: galaxy.jobs.runners ERROR 2021-01-18 13:40:23,117 [p:29538,w:0,m:1] [SlurmRunner.work_thread-1] (247/179) Job wrapper finish method failed
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: Traceback (most recent call last):
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: File "/srv/galaxy/server/lib/galaxy/jobs/runners/__init__.py", line 540, in _finish_or_resubmit_job
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: job_wrapper.finish(tool_stdout, tool_stderr, exit_code, check_output_detected_state=check_output_detected_state, job_stdout=job_stdout, job_stderr=job_stderr)
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: File "/srv/galaxy/server/lib/galaxy/jobs/__init__.py", line 1682, in finish
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: output_name, dataset, job, context, final_job_state, remote_metadata_directory
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: File "/srv/galaxy/server/lib/galaxy/jobs/__init__.py", line 1539, in _finish_dataset
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: dataset.datatype.set_meta(dataset, overwrite=False)
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: File "/srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/lib/galaxy/datatypes/wsf.py", line 90, in set_meta
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: self.set_dataset_metadata_from_comments( dataset )
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: File "/srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/lib/galaxy/datatypes/wsf.py", line 150, in set_dataset_metadata_from_comments
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: Fake.set_dataset_metadata_from_comments( self, dataset )
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: File "/srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/lib/galaxy/datatypes/wsf.py", line 112, in set_dataset_metadata_from_comments
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: self.set_dataset_species_metadata( dataset )
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: File "/srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/lib/galaxy/datatypes/wsf.py", line 134, in set_dataset_species_metadata
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: if isinstance( value_from_comment_metadata, basestring ):
Jan 18 13:40:23 test.usegalaxy.no uwsgi[29303]: NameError: name 'basestring' is not defined

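So: Galaxy has moved to Python 3, and it seems this tool (or rather the tools in this package) uses a datatype script (wsf.py) that has not been ported to Python 3. According to the traceback, set_meta is called from the Galaxy server process itself, so it runs under Galaxy's Python 3 regardless of which Python the container provides. Hence the failure.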

This dirty hack seems to fix it:

diff -uN /srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/lib/galaxy/datatypes/wsf.py.orig /srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/lib/galaxy/datatypes/wsf.py
--- /srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/lib/galaxy/datatypes/wsf.py.orig  2021-01-18 14:01:54.278686698 +0100
+++ /srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/miller-lab/genome_diversity/e56023008e36/genome_diversity/lib/galaxy/datatypes/wsf.py   2021-01-18 14:06:11.708987357 +0100
@@ -131,12 +131,12 @@

     def set_dataset_species_metadata( self, dataset ):
         value_from_comment_metadata = dataset.metadata.comment_metadata.get( 'species', None )
-        if isinstance( value_from_comment_metadata, basestring ):
+        if isinstance( value_from_comment_metadata, str ):
             dataset.metadata.species = value_from_comment_metadata

     def set_dataset_dbkey_metadata( self, dataset ):
         value_from_comment_metadata = dataset.metadata.comment_metadata.get( 'dbkey', '?' )
-        if isinstance( value_from_comment_metadata, basestring ):
+        if isinstance( value_from_comment_metadata, str ):
             dataset.metadata.dbkey = value_from_comment_metadata

 class GDSnp( Fake ):
@@ -162,7 +162,7 @@
             if not isinstance( individual, list ) or len( individual ) != 2:
                 continue
             name, col = individual
-            if not isinstance( name, basestring ):
+            if not isinstance( name, str ):
                 name = ''
             try:
                 c = int( col )

At least my small test turned green now.

Handling all these ... ancient tools is very, very time consuming and annoying. So you should discuss if we want to support them or not.

ehj000 commented 3 years ago

I agree that this needs to be discussed. We do not have unlimited resources for doing support. Could this be an item for the next meeting? Your problem solving is much appreciated. I guess this is only on test.usegalaxy.no?

torfinnnome commented 3 years ago

I updated usegalaxy.no as well.

Should also be discussed: We need a way to keep a log of modifications to our tools.
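
One lightweight option, in the spirit of the `.orig` copies and `diff -uN` output used earlier in this thread, would be to store a unified diff for every local change. The sketch below uses Python's standard `difflib`; the function name `modification_record` is hypothetical, and where the records are kept (for example a git repository) is left open.

```python
import difflib


def modification_record(original: str, modified: str, filename: str) -> str:
    """Return a unified diff describing a local change to a tool file."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        modified.splitlines(keepends=True),
        fromfile=filename + ".orig",
        tofile=filename,
    )
    return "".join(diff)


# Illustrative input modelled on one of the wsf.py changes above.
before = "if not isinstance( name, basestring ):\n    name = ''\n"
after = "if not isinstance( name, str ):\n    name = ''\n"
print(modification_record(before, after, "wsf.py"))
```

An empty string is returned when the two versions are identical, which makes it easy to skip files that were never touched.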

ehj000 commented 3 years ago

Ok, thanks. The tool still fails on usegalaxy.no with the initial error (a bytes-like object is required), but runs on test.

torfinnnome commented 3 years ago

Try again now?

ehj000 commented 3 years ago

Now it is working. Thank you very much, Torfinn.