ReproNim / testkraken

Generalized regression testing of scientific workflows

try_data_download doesn't work for python 3.8 #93

Open djarecka opened 3 years ago

djarecka commented 3 years ago

@leej3 - I've noticed that the datalad part doesn't work with Python 3.8; I get an error from try_data_download when running `testkraken testkraken/workflows4regtests/afni_dc2019/3dcopy_datalad` (or `pytest -vs testkraken/tests/test_afni.py`)

Traceback (most recent call last):
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/bin/testkraken", line 33, in <module>
    sys.exit(load_entry_point('testkraken', 'console_scripts', 'testkraken')())
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/dorota/testkraken/testkraken/cli.py", line 24, in main
    wf.run()
  File "/Users/dorota/testkraken/testkraken/workflowregtest.py", line 117, in run
    self._run_workflow_in_matrix_of_envs()
  File "/Users/dorota/testkraken/testkraken/workflowregtest.py", line 160, in _run_workflow_in_matrix_of_envs
    self._run_pydra(image=image, soft_ver_str=name)
  File "/Users/dorota/testkraken/testkraken/workflowregtest.py", line 218, in _run_pydra
    process_path_obj(value, self.data_path)
  File "/Users/dorota/testkraken/testkraken/data_management.py", line 205, in process_path_obj
    fetch_status = try_data_download(files_to_fetch, test_data_dir, logger)
  File "/Users/dorota/testkraken/testkraken/data_management.py", line 236, in try_data_download
    process_for_fetching_data.start()
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/dorota/miniconda3/envs/tmp_datalad_testkraken_py38/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
NotImplementedError: object proxy must define __reduce_ex__()

Have you had this error before?

yarikoptic commented 3 years ago

never saw such a thing. I would jump into pdb and check what that obj really is (where it's coming from, etc.). I do not see anything obvious from `git grep 'proxy>' | grep '.py'` within datalad -- so it might be some dependency

djarecka commented 3 years ago

I guess it's my lucky week... seeing weird bugs everywhere... will debug!

leej3 commented 3 years ago

I haven’t seen that either. Previously I experienced asyncio problems but I don’t ever remember problems with multiprocessing.

Happy to help with some live debugging next week if you think that would be of use.

djarecka commented 3 years ago

you say asyncio? we might need help with this as well ;-) will let you know if I figure this out!

yarikoptic commented 3 years ago

yeah, nothing so far in this particular issue points to asyncio ;)

djarecka commented 3 years ago

it looks like I'm not able to run any method on a datalad dataset using Process; I always get an error from Process.start. For now, if the exception is raised, I will just call dataset.get without Process.

I believe this is an issue between datalad and multiprocessing, but I'm not completely sure...
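The fallback described above (try Process, and call the method directly if Process.start blows up) could be sketched like this. This is a hypothetical helper for illustration, not testkraken's actual code; `fetch_with_fallback` and its signature are invented here:

```python
import multiprocessing


def fetch_with_fallback(target, timeout=None, **kwargs):
    """Hypothetical helper: try running `target` in a subprocess; if the
    target cannot be pickled and Process.start raises NotImplementedError
    (as with wrapt object proxies), call it directly instead."""
    try:
        proc = multiprocessing.Process(target=target, kwargs=kwargs)
        proc.start()
    except NotImplementedError:
        # e.g. a wrapt object proxy under the "spawn" start method
        target(**kwargs)
        return "direct"
    proc.join(timeout)
    return "subprocess"
```

Usage would look something like `fetch_with_fallback(dl_dset.get, path="...")`; the direct call loses the ability to time out the download, which is presumably why Process was used in the first place.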

yarikoptic commented 3 years ago

Do you have a small reproducer?

djarecka commented 3 years ago

try this with py3.8:

import datalad.api as datalad
from multiprocessing import Process

dl_dset = datalad.Dataset("blah")
datalad.clone('https://github.com/afni/afni_data.git', dl_dset.path)
fetching_data = Process(target=dl_dset.get, kwargs={"path": 'atlases/MNI152_2009_template.nii.gz'})
fetching_data.start()

yarikoptic commented 3 years ago

thanks! seems to be yet another macOS gotcha (works on Linux :-/)

yarikoptic commented 3 years ago

boils down to wrapt ... more details or fixes (if any come) will be in https://github.com/datalad/datalad/pull/5369
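The error message in the traceback is wrapt's own: its object proxies deliberately raise from __reduce_ex__ because they cannot be pickled generically. A minimal stand-in class (a hypothetical re-implementation for illustration, so wrapt itself need not be installed) reproduces the exact message:

```python
import pickle


class ObjectProxyLike:
    """Mimics wrapt.ObjectProxy's pickling behavior (illustrative only)."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __reduce_ex__(self, protocol):
        raise NotImplementedError("object proxy must define __reduce_ex__()")


try:
    pickle.dumps(ObjectProxyLike(42))
except NotImplementedError as e:
    print(e)  # same message as in the traceback above
```

Anything reachable from the Process target that is wrapped in such a proxy will poison the pickling step under the "spawn" start method.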