peterjc / pico_galaxy

Galaxy tools and wrappers for sequence analysis
17 stars 25 forks source link

pure python Venn Diagrams #33

Closed FredericBGA closed 5 years ago

FredericBGA commented 5 years ago

Here is a pure python implementation (using Konstantin Tretyakov's matplotlib-venn package) Works here for me. I'm not sure that everything will be suitable for all galaxy instances. Ie: I need matplotlib.use('Agg') in order to be able to draw the plot within PBSPro.

peterjc commented 5 years ago

Is this intended as an alternative to rpy2 for solving #32?

Note there's a whole bunch of Python style issues, running black will help with some.

3.84s$ flake8 .
./tools/venn_list/venn_list.py:3:1: D400 First line should end with a period
./tools/venn_list/venn_list.py:3:1: D205 1 blank line required between summary line and description
./tools/venn_list/venn_list.py:49:1: D210 No whitespaces allowed surrounding docstring text
./tools/venn_list/venn_list.py:49:1: D403 First word of the first line should be properly capitalized
./tools/venn_list/venn_list.py:49:1: D401 First line should be in imperative mood
./tools/venn_list/venn_list.py:49:1: D400 First line should end with a period
./tools/venn_list/venn_list.py:49:1: D300 Use """triple double quotes"""
./tools/venn_list/venn_list.py:81:1: D210 No whitespaces allowed surrounding docstring text
./tools/venn_list/venn_list.py:81:1: D403 First word of the first line should be properly capitalized
./tools/venn_list/venn_list.py:81:1: D400 First line should end with a period
./tools/venn_list/venn_list.py:81:1: D300 Use """triple double quotes"""
./tools/venn_list/venn_list.py:90:1: E305 expected 2 blank lines after class or function definition, found 1

I would suggest this deserves at least a minor version bump, not just a revision bump, as you've done a lot of work here.

FredericBGA commented 5 years ago

yes, I wanted to solve #32 I've run flake8 locally, it will be ok now I hope. I've update the version to 0.1.00 (in XML as well)

peterjc commented 5 years ago

Getting rpy and limma on TravisCI was a pain so until now this wasn't being automatically tests. The new dependencies make this possible 👍

However, something up with the tests... perhaps a more general issue than just this tool?

======================================================================
FAIL: ( venn_list ) > Test-1
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-dev/test/functional/test_toolbox.py", line 99, in test_tool
    self.do_it(tool_version=tool_version, test_index=test_index)
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-dev/test/functional/test_toolbox.py", line 36, in do_it
    verify_tool(tool_id, self.galaxy_interactor, resource_parameters=resource_parameters, test_index=test_index, tool_version=tool_version, register_job_data=register_job_data)
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-dev/lib/galaxy/tools/verify/interactor.py", line 779, in verify_tool
    raise e
JobOutputsError: 'utf8' codec can't decode byte 0xac in position 10: invalid start byte
======================================================================
FAIL: ( venn_list ) > Test-2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-dev/test/functional/test_toolbox.py", line 99, in test_tool
    self.do_it(tool_version=tool_version, test_index=test_index)
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-dev/test/functional/test_toolbox.py", line 36, in do_it
    verify_tool(tool_id, self.galaxy_interactor, resource_parameters=resource_parameters, test_index=test_index, tool_version=tool_version, register_job_data=register_job_data)
  File "/home/travis/build/peterjc/pico_galaxy/galaxy-dev/lib/galaxy/tools/verify/interactor.py", line 779, in verify_tool
    raise e
JobOutputsError: 'utf8' codec can't decode byte 0xac in position 10: invalid start byte
----------------------------------------------------------------------
peterjc commented 5 years ago

Using hexdump I don't see 0xac in any of our files...

$ hexdump -C test-data/magic.pdf test-data/rhodopsin_proteins.fasta test-data/venn_list.tabular | grep " ac "
$ hexdump -C tools/venn_list/* | grep " ac "
peterjc commented 5 years ago

Got it - something in the matplotlib output is upsetting Galaxy, based on one of the test cases:

$ python ../tools/venn_list/venn_list.py - - "Example" rhodopsin_proteins.fasta fasta "Proteins" venn_list.tabular tabular "Stuff" example.pdf
Doing 2-way Venn Diagram
Inferred total of 10 IDs
6 in Proteins
10 in Stuff
Done

And,

$ hexdump -C example.pdf | head -n 1
00000000  25 50 44 46 2d 31 2e 34  0a 25 ac dc 20 ab ba 0a  |%PDF-1.4.%.. ...|

There is the 0xac character as the eleventh charctacter, position 10 in Python counting, matching the exception:

JobOutputsError: 'utf8' codec can't decode byte 0xac in position 10: invalid start byte

Consider:

$ python -c "print(open('example.pdf').readline())"
%PDF-1.4

vs:

$ python3 -c "print(open('example.pdf').readline())"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position 10: invalid start byte

(Galaxy does still seem to be running under Python 2 on the failing test, but must nevertheless being doing something with encodings to trigger this - probably in the sniffer code)

FredericBGA commented 5 years ago

thank you for the last stage of polishing! I didn't know black.

it works here on a 19-01 galaxy server.

I've launched planemo test database/shed_tools/toolshed.g2.bx.psu.edu/repos/peterjc/venn_list/20d347feb882/venn_list/tools/venn_list/venn_list.xml

peterjc commented 5 years ago

It looks like https://github.com/galaxyproject/galaxy/issues/7957 is causing the failure in Galaxy 19.05, but it is being worked on so ought to be fixed this week (I hope).

I've only recently started using black on this repository, and ought to write a CONTRIBUTING file explaining this.

FredericBGA commented 5 years ago

It looks like galaxyproject/galaxy#7957 is causing the failure in Galaxy 19.05, but it is being worked on so ought to be fixed this week (I hope).

:+1:

I've only recently started using black on this repository, and ought to write a CONTRIBUTING file explaining this.

Black is a tough guy! I knew pylint, then flake8. But black does not like compromise (and he's proud of that...)

peterjc commented 5 years ago

https://github.com/galaxyproject/galaxy/issues/7957 has been fixed, re-running TravisCI for Galaxy dev branch...

peterjc commented 5 years ago

Hmm, maybe the fix wasn't on the branch I expected yet - I will ask on the Galaxy issue.

peterjc commented 5 years ago

Should now be fixed on Galaxy dev via https://github.com/galaxyproject/galaxy/pull/8010 - retesting...

peterjc commented 5 years ago

Nope, the release_19.05 branch should be fixed, but not the dev branch (yet).

peterjc commented 5 years ago

Yay, Galaxy issue fixed, tests passed, merged. This should be on the Test Tool Shed now - if that looks all fine, time to push to the Main Tool Shed...