Open alastair opened 4 years ago
https://github.com/MTG/gaia/blob/c72b8f1f744d699b968e92a8d6f8b20b6dde2378/src/bindings/pygaia/classification/classificationtask.py#L200 This doesn't work in Python 3: pickle.load needs a binary file object that yields bytes, but sys.stdin is a text stream that yields str. Ideally this would be fixed by #96, which would let us remove cluster mode, but in the meantime we should work out a solution to this specific problem.
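One possible interim fix is to read from the binary buffer underneath sys.stdin. The sketch below is not the project's code: load_task is a hypothetical helper name, and the dictionary payload is made up purely to show the round trip.

```python
import io
import pickle
import sys


def load_task(stream=None):
    """Unpickle one object from a binary stream (hypothetical helper)."""
    if stream is None:
        # In Python 3, sys.stdin is a text stream and sys.stdin.buffer is the
        # underlying binary stream. Python 2's sys.stdin has no .buffer
        # attribute, so fall back to sys.stdin itself there.
        stream = getattr(sys.stdin, "buffer", sys.stdin)
    return pickle.load(stream)


# Round trip with an in-memory binary stream standing in for stdin.
# The payload contents are illustrative only.
payload = pickle.dumps({"classifier": "svm", "C": 5})
task = load_task(io.BytesIO(payload))
```

The getattr fallback keeps the same call working under both interpreters, which may matter while cluster mode still has to run on Python 2.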
In Python 3, taskhash is generating different hashes for the same combination of parameters:
https://github.com/MTG/gaia/blob/f227e410baff51b9ccc5af1578171d86886a1f1c/src/bindings/pygaia/classification/classificationtaskmanager.py#L42
When I run this script multiple times, it doesn't skip already completed jobs as it does in Python 2: https://github.com/MTG/gaia/blob/c72b8f1f744d699b968e92a8d6f8b20b6dde2378/src/bindings/pygaia/classification/classificationtask.py#L174
Running the same script multiple times in the same Docker container results in different hashes each time, even though a random seed is set in the project file.
I've replaced it with sha1(json.dumps(config)), and now it's stable.
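A minimal sketch of that replacement follows. The name stable_taskhash and the example config are hypothetical, and sort_keys=True plus the explicit UTF-8 encoding are my assumptions rather than something stated above (hashlib.sha1 needs bytes in Python 3, and sorting keys makes the result independent of dictionary ordering). One plausible cause of the original instability, not confirmed in this thread, is that Python 3 salts the built-in hash() of strings per process unless PYTHONHASHSEED is set.

```python
import hashlib
import json


def stable_taskhash(config):
    """Return a hex digest that is identical across runs and processes.

    Hypothetical helper: `config` is assumed to be JSON-serializable.
    """
    # sort_keys=True gives the same serialization regardless of the order
    # in which the dictionary was built.
    serialized = json.dumps(config, sort_keys=True)
    # hashlib works on bytes, so encode the JSON text explicitly.
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


# Same parameters in a different insertion order (illustrative values only).
a = stable_taskhash({"classifier": "svm", "kernel": "rbf", "C": 5})
b = stable_taskhash({"C": 5, "kernel": "rbf", "classifier": "svm"})
```

Here a and b are equal, so a job keyed on this digest would be recognized as already done on a later run.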
Running in Python 3 with clustermode=False, after about 290 jobs I get an exception:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.6/concurrent/futures/process.py", line 295, in _queue_management_worker
    shutdown_worker()
  File "/usr/lib/python3.6/concurrent/futures/process.py", line 253, in shutdown_worker
    call_queue.put_nowait(None)
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 129, in put_nowait
    return self.put(obj, False)
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 83, in put
    raise Full
queue.Full
This has happened more than once. Is something not cleaning up properly? Maybe a difference between concurrent.futures in python 3.6 and the backport we use in python 2?
:heavy_check_mark: This item is fixed in #115
There is still some Python code which only runs in Python 2. Most important are the uses of basestring and unicode: https://github.com/MTG/gaia/blob/ed433ed3f1fa3ceea1ccbb77929fdb10ddaa8bdd/src/bindings/pythonic.i#L24-L31
These will have to be converted, but first we have to understand how the types are used in this method. In theory both basestring and unicode can be changed to str, but in Python 3 we should double-check where items should be actual strings and where they should be encoded bytes.
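As a rough illustration of the distinction, here is a common compatibility idiom written in plain Python. It is not taken from pythonic.i; string_types, text_type, is_string, and to_text are hypothetical names, and UTF-8 is an assumed encoding.

```python
try:
    # Python 2: basestring covers both str and unicode.
    string_types = (basestring,)
    text_type = unicode
except NameError:
    # Python 3: basestring and unicode no longer exist; str is the text type.
    string_types = (str,)
    text_type = str


def is_string(value):
    """True for text values; in Python 3, bytes are deliberately excluded."""
    return isinstance(value, string_types)


def to_text(value, encoding="utf-8"):
    """Decode bytes to text and return any other value unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value
```

Under Python 3, is_string("abc") is true while is_string(b"abc") is false, which is exactly the case that needs checking at each call site: whether the caller expects text or encoded bytes.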