Open JGuetschow opened 1 year ago
Possible solutions off the top of my head: use datalad.api.run from Python instead of the CLI (no limitations on str length there).
I do run datalad.api.run from Python.
Ah, okay, so the error is from within datalad when it tries to run git?
If you add a full traceback I can forward it upstream
Viewing the full output, I now think it's a known issue and a workaround exists (there was a lot of datalad output chucked in between the error output, so I didn't see the first part of the error).
Here is the (shortened) output:
[INFO] == Command exit (modification check follows) =====
[ERROR] Caught exception suggesting too large stack size limits. Hint: use 'ulimit -s' command to see current limit and e.g. 'ulimit -s 8192' to reduce it to avoid this exception. See https://github.com/datalad/datalad/issues/6106 for more information.
[WARNING] Received an exception OSError([Errno 7] Argument list too long: 'git'). Canceling not-yet running jobs and waiting for completion of running. You can force earlier forceful exit by Ctrl-C.
[INFO] Canceled 0 out of 0 jobs. 0 left running.
<some info on 'unlock', 'run', and 'add' command follows (all ok)>
Traceback (most recent call last):
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/parallel.py", line 368, in _iter_threads
raise _FinalShutdown()
datalad.support.parallel._FinalShutdown
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "UNFCCC_GHG_data/UNFCCC_CRF_reader/read_new_UNFCCC_CRF_for_year_datalad.py", line 26, in <module>
read_new_crf_for_year_datalad(
File "<repo_path>/UNFCCC_GHG_data/UNFCCC_CRF_reader/UNFCCC_CRF_reader_prod.py", line 407, in read_new_crf_for_year_datalad
datalad.api.run(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/interface/base.py", line 773, in eval_func
return return_func(*args, **kwargs)
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/interface/base.py", line 763, in return_func
results = list(results)
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/interface/base.py", line 873, in _execute_command_
for r in _process_results(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/interface/utils.py", line 319, in _process_results
for res in results:
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/core/local/run.py", line 297, in __call__
for r in run_command(cmd, dataset=dataset,
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/core/local/run.py", line 1091, in run_command
for r in Save.__call__(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/interface/base.py", line 873, in _execute_command_
for r in _process_results(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/interface/utils.py", line 319, in _process_results
for res in results:
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/core/local/save.py", line 391, in __call__
yield from ProducerConsumerProgressLog(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/parallel.py", line 535, in __iter__
for res in super().__iter__():
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/parallel.py", line 265, in __iter__
yield from self._iter_threads(self._jobs)
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/parallel.py", line 417, in _iter_threads
self.shutdown(force=True, exception=self._producer_exception or interrupted_by_exception)
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/parallel.py", line 233, in shutdown
raise exception
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/parallel.py", line 401, in _iter_threads
done_useful |= self._pop_done_futures(lgr)
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/parallel.py", line 463, in _pop_done_futures
raise exception
File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/parallel.py", line 329, in consumer_worker
for r in res:
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/core/local/save.py", line 310, in save_ds
for res in pds_repo.save_(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/gitrepo.py", line 3579, in save_
self._save_post(message, chain(*status_state.values()), need_partial_commit, amend=amend,
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/annexrepo.py", line 3556, in _save_post
super(AnnexRepo, self)._save_post(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/gitrepo.py", line 3331, in _save_post
GitRepo.commit(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/support/gitrepo.py", line 1449, in commit
_ = self._call_git(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/dataset/gitrepo.py", line 398, in _call_git
for file_no, line in self._generator_call_git(args,
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/dataset/gitrepo.py", line 355, in _generator_call_git
for file_no, content in generator:
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/runner/gitrunner.py", line 299, in run_on_filelist_chunks_items_
for chunk_generator in self._get_chunked_results(cmd=cmd,
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/runner/gitrunner.py", line 184, in _get_chunked_results
yield self.run(
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/runner/runner.py", line 206, in run
results_or_iterator = threaded_runner.run()
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/runner/nonasyncrunner.py", line 343, in run
return self._locked_run()
File "<repo_path>/venv/lib/python3.8/site-packages/datalad/runner/nonasyncrunner.py", line 403, in _locked_run
self.process = Popen(self.cmd, **kwargs) # nosec
File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 7] Argument list too long: 'git'
TaskFailed - taskid:read_new_unfccc_crf_for_year
Command failed: './venv/bin/python UNFCCC_GHG_data/UNFCCC_CRF_reader/read_new_UNFCCC_CRF_for_year_datalad.py --submission_year=2022 --re_read' returned 1
read_new_unfccc_crf_for_year does not commit its results because of the error
OSError: [Errno 7] Argument list too long: 'git'
This probably comes from the very long list of files that are affected by the commit and passed to datalad. To get rid of this message we could commit country by country, or dig deeper and find another solution. As manually committing after the error in the script works fine, it's not urgent to solve this issue.
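The "commit country by country" idea could look roughly like the sketch below. It only shows the batching logic: `batch_paths`, `save_by_country`, the `MAX_ARG_BYTES` budget, and the commit message format are all hypothetical names and values chosen for illustration, not part of datalad or of this repository. The actual save step is passed in as a callable, so the sketch makes no claim about how datalad chunks file lists internally or whether this avoids the error in every setup.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical per-call budget in bytes. This is an assumed, conservative
# value for illustration, not a limit taken from datalad or the kernel.
MAX_ARG_BYTES = 128 * 1024


def batch_paths(
    paths: Iterable[str], max_bytes: int = MAX_ARG_BYTES
) -> Iterator[List[str]]:
    """Yield lists of paths whose combined encoded length stays within max_bytes.

    A single path longer than max_bytes is still yielded, alone in its batch.
    """
    batch: List[str] = []
    size = 0
    for path in paths:
        cost = len(path.encode("utf-8")) + 1  # +1 for the argument terminator
        if batch and size + cost > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(path)
        size += cost
    if batch:
        yield batch


def save_by_country(
    files_by_country: Dict[str, List[str]],
    save: Callable[[List[str], str], None],
    max_bytes: int = MAX_ARG_BYTES,
) -> int:
    """Call save(paths, message) once per batch and return the number of calls."""
    calls = 0
    for country, paths in sorted(files_by_country.items()):
        for batch in batch_paths(paths, max_bytes):
            save(batch, f"Read CRF data for {country}")
            calls += 1
    return calls
```

In the reader script, `save` might be a thin wrapper around `datalad.api.save(path=..., message=...)`. Note that every batch then becomes its own commit, so the history would contain one or more commits per country instead of a single commit per submission year.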