MTG / gaia

C++ library to apply similarity measures and classifications on the results of audio analysis, including Python bindings. Together with Essentia it can be used to compute high-level descriptions of music.
http://essentia.upf.edu
GNU Affero General Public License v3.0

pending python 3 upgrade tasks #111

Open alastair opened 4 years ago

alastair commented 4 years ago

✔ This item is fixed in #115

There is still some python code which only runs in python 2. The most important cases are the uses of basestring and unicode: https://github.com/MTG/gaia/blob/ed433ed3f1fa3ceea1ccbb77929fdb10ddaa8bdd/src/bindings/pythonic.i#L24-L31

These will have to be converted, but we have to understand the use of the types in this method. Theoretically both basestring and unicode can be changed to str, but in python 3, we should double-check where items should be actual strings, and where they should be encoded bytes.
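As a rough illustration of the str/bytes distinction in python 3, here is a minimal sketch of a normalising helper. The name to_text and the UTF-8 default are assumptions for illustration only; they are not taken from pythonic.i, and the real fix depends on what each call site expects.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes.

    Hypothetical helper: in python 3 there is no basestring or unicode,
    so text (str) and encoded data (bytes) have to be told apart explicitly.
    """
    if isinstance(value, bytes):
        # Encoded data: decode to text (the encoding is an assumption).
        return value.decode(encoding)
    if isinstance(value, str):
        # Already text, nothing to do.
        return value
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)
```

Calling to_text(b"genre") and to_text("genre") both give the str "genre", while any other type raises TypeError instead of being silently coerced.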

alastair commented 4 years ago

https://github.com/MTG/gaia/blob/c72b8f1f744d699b968e92a8d6f8b20b6dde2378/src/bindings/pygaia/classification/classificationtask.py#L200 This doesn't work in python 3, as pickle.load requires a binary stream that returns bytes, but sys.stdin is a text stream that returns str. Ideally this would be fixed by #96, allowing us to remove cluster mode, but in the meantime we should work out a solution to this specific problem.
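One possible interim workaround for the text-versus-bytes mismatch described above is to read from sys.stdin.buffer, the underlying binary stream of the standard sys.stdin in python 3. This is a sketch only: load_pickled_stdin is a hypothetical name, and it has not been tested against classificationtask.py.

```python
import io
import pickle
import sys


def load_pickled_stdin(stream=None):
    """Unpickle one object from a binary stream.

    Defaults to sys.stdin.buffer, which yields bytes rather than str.
    The stream parameter exists only to make the function testable.
    """
    if stream is None:
        stream = sys.stdin.buffer
    return pickle.load(stream)


# Usage with an in-memory stream standing in for stdin:
fake_stdin = io.BytesIO(pickle.dumps({"clustermode": True}))
job = load_pickled_stdin(fake_stdin)
```

Note that sys.stdin.buffer is not available if sys.stdin has been replaced by an object without a binary buffer (for example io.StringIO), so this only covers the ordinary command-line case.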

alastair commented 4 years ago

In python 3, taskhash is generating different hashes for the same combination of parameters: https://github.com/MTG/gaia/blob/f227e410baff51b9ccc5af1578171d86886a1f1c/src/bindings/pygaia/classification/classificationtaskmanager.py#L42

When I run this script multiple times, it doesn't skip jobs that are already done, as it does in python 2: https://github.com/MTG/gaia/blob/c72b8f1f744d699b968e92a8d6f8b20b6dde2378/src/bindings/pygaia/classification/classificationtask.py#L174

Running the same script multiple times in the same docker container results in different hashes each time. There is a random seed set in the project file.

I've replaced it with sha1(json.dumps(config)), and now it's stable
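A minimal sketch of that replacement is below. The function name stable_task_hash is hypothetical, and sort_keys=True plus the UTF-8 encoding are my additions rather than part of the change described above: hashlib.sha1 needs bytes in python 3, and sorting keys keeps the digest independent of dict ordering. One possible cause of the instability, assuming taskhash relies on the built-in hash() of strings, is that python 3 randomises string hashes per process by default.

```python
import hashlib
import json


def stable_task_hash(config):
    """Return a hex digest that is identical across runs for equal configs.

    config must be JSON-serialisable (an assumption about the real data).
    """
    # sort_keys makes the serialisation deterministic regardless of key order.
    serialized = json.dumps(config, sort_keys=True)
    # sha1 operates on bytes, so the JSON text is encoded first.
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


# Two dicts with the same content but different key order hash identically.
h1 = stable_task_hash({"classifier": "svm", "C": 1.0})
h2 = stable_task_hash({"C": 1.0, "classifier": "svm"})
```

Unlike hash(), the digest does not depend on PYTHONHASHSEED, so repeated runs in separate processes should produce the same value.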

alastair commented 4 years ago

Running in python 3 with clustermode=False, I get an exception after about 290 jobs:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.6/concurrent/futures/process.py", line 295, in _queue_management_worker
    shutdown_worker()
  File "/usr/lib/python3.6/concurrent/futures/process.py", line 253, in shutdown_worker
    call_queue.put_nowait(None)
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 129, in put_nowait
    return self.put(obj, False)
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 83, in put
    raise Full
queue.Full

This has happened more than once. Is something not cleaning up properly? Maybe a difference between concurrent.futures in python 3.6 and the backport we use in python 2?