Open dDArchivist opened 5 years ago
Interesting, it looks like you are on Windows? If you clone the repo and run the tests do they pass?
git clone https://github.com/libraryofcongress/bagit-python.git
cd bagit-python
python setup.py test
I wonder if this difference in behavior of multiprocessing on Windows might help explain what's going on?
Hi. Thanks for taking a look. I followed your directions by using the bash emulator in Git for Windows. I'm attaching a sample.
It looks like the test suite is noticing problems too. We should really be running the tests on Windows regularly as part of our builds.
Yeah, back in the day there wasn't a great Windows CI option but there are several now. In addition to Travis, I know bagit-java uses https://www.appveyor.com/ and Azure Pipelines announced a free tier for open-source: https://azure.microsoft.com/en-us/blog/announcing-azure-pipelines-with-unlimited-ci-cd-minutes-for-open-source/
Same issue here. I'd also suggest that python setup.py test vs. tox -e py38 vs. the primary issue in this ticket look like different issues, though I can recreate those on Windows 11. I need more time to have a look, but I'll attach some of the multiprocessing error log below.
NB. Truncated, as it just loops infinitely, or at least for longer than I'm keen to find out.
2022-05-12 13:36:29,225 - INFO - Creating bag for directory C:\temp\testing\bags\govdoc-gifs
2022-05-12 13:36:29,229 - INFO - Creating data directory
2022-05-12 13:36:29,235 - INFO - Moving bag-info.txt to C:\temp\testing\bags\govdoc-gifs\tmp7t7u69he\bag-info.txt
2022-05-12 13:36:29,235 - INFO - Moving bagit.txt to C:\temp\testing\bags\govdoc-gifs\tmp7t7u69he\bagit.txt
2022-05-12 13:36:29,236 - INFO - Moving data to C:\temp\testing\bags\govdoc-gifs\tmp7t7u69he\data
2022-05-12 13:36:29,237 - INFO - Moving manifest-sha256.txt to C:\temp\testing\bags\govdoc-gifs\tmp7t7u69he\manifest-sha256.txt
2022-05-12 13:36:29,238 - INFO - Moving manifest-sha512.txt to C:\temp\testing\bags\govdoc-gifs\tmp7t7u69he\manifest-sha512.txt
2022-05-12 13:36:29,239 - INFO - Moving tagmanifest-sha256.txt to C:\temp\testing\bags\govdoc-gifs\tmp7t7u69he\tagmanifest-sha256.txt
2022-05-12 13:36:29,239 - INFO - Moving tagmanifest-sha512.txt to C:\temp\testing\bags\govdoc-gifs\tmp7t7u69he\tagmanifest-sha512.txt
2022-05-12 13:36:29,240 - INFO - Moving C:\temp\testing\bags\govdoc-gifs\tmp7t7u69he to data
2022-05-12 13:36:29,241 - INFO - Using 2 processes to generate manifests: sha256, sha512
Traceback (most recent call last):
Traceback (most recent call last):
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\packaging\requirements.py", line 98, in __init__
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\packaging\requirements.py", line 98, in __init__
req = REQUIREMENT.parseString(requirement_string)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1654, in parseString
req = REQUIREMENT.parseString(requirement_string)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1654, in parseString
raise exc
raise exc
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1644, in parseString
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1644, in parseString
loc, tokens = self._parse( instring, 0 )
loc, tokens = self._parse( instring, 0 )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3417, in parseImpl
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3417, in parseImpl
loc, exprtokens = e._parse( instring, loc, doActions )
loc, exprtokens = e._parse( instring, loc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3739, in parseImpl
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3739, in parseImpl
return self.expr._parse( instring, loc, doActions, callPreParse=False )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
return self.expr._parse( instring, loc, doActions, callPreParse=False )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3400, in parseImpl
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3400, in parseImpl
loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1406, in _parseNoCache
loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1406, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 2711, in parseImpl
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 2711, in parseImpl
raise ParseException(instring, loc, self.errmsg, self)
pkg_resources._vendor.pyparsing.ParseException: Expected W:(abcd...) (at char 0), (line:1, col:1)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
raise ParseException(instring, loc, self.errmsg, self)
pkg_resources._vendor.pyparsing.ParseException: Expected W:(abcd...) (at char 0), (line:1, col:1)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 116, in spawn_main
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 125, in _main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 234, in prepare
prepare(preparation_data)
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 234, in prepare
_fixup_main_from_name(data['init_main_from_name'])
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 258, in _fixup_main_from_name
_fixup_main_from_name(data['init_main_from_name'])
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 258, in _fixup_main_from_name
main_content = runpy.run_module(mod_name,
File "C:\Users\Spencer\Apps\python\lib\runpy.py", line 210, in run_module
main_content = runpy.run_module(mod_name,
File "C:\Users\Spencer\Apps\python\lib\runpy.py", line 210, in run_module
return _run_module_code(code, init_globals, run_name, mod_spec)
File "C:\Users\Spencer\Apps\python\lib\runpy.py", line 97, in _run_module_code
return _run_module_code(code, init_globals, run_name, mod_spec)
File "C:\Users\Spencer\Apps\python\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\Spencer\Apps\python\lib\runpy.py", line 87, in _run_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\Spencer\Apps\python\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\Spencer\Apps\python\lib\site-packages\bagit.py", line 52, in <module>
exec(code, run_globals)
File "C:\Users\Spencer\Apps\python\lib\site-packages\bagit.py", line 52, in <module>
VERSION = get_distribution(MODULE_NAME).version
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\__init__.py", line 464, in get_distribution
VERSION = get_distribution(MODULE_NAME).version
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\__init__.py", line 464, in get_distribution
dist = Requirement.parse(dist)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\__init__.py", line 3139, in parse
dist = Requirement.parse(dist)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\__init__.py", line 3139, in parse
req, = parse_requirements(s)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\__init__.py", line 3084, in parse_requirements
req, = parse_requirements(s)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\__init__.py", line 3084, in parse_requirements
yield Requirement(line)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\__init__.py", line 3094, in __init__
yield Requirement(line)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\__init__.py", line 3094, in __init__
super(Requirement, self).__init__(requirement_string)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\packaging\requirements.py", line 100, in __init__
super(Requirement, self).__init__(requirement_string)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\packaging\requirements.py", line 100, in __init__
raise InvalidRequirement(
pkg_resources.extern.packaging.requirements.InvalidRequirement: Parse error at "'__mp_mai'": Expected W:(abcd...)
raise InvalidRequirement(
pkg_resources.extern.packaging.requirements.InvalidRequirement: Parse error at "'__mp_mai'": Expected W:(abcd...)
Traceback (most recent call last):
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\packaging\requirements.py", line 98, in __init__
req = REQUIREMENT.parseString(requirement_string)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1654, in parseString
raise exc
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1644, in parseString
Traceback (most recent call last):
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\packaging\requirements.py", line 98, in __init__
loc, tokens = self._parse( instring, 0 )
req = REQUIREMENT.parseString(requirement_string)
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1654, in parseString
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3417, in parseImpl
raise exc
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1644, in parseString
loc, tokens = self._parse( instring, 0 )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
loc, exprtokens = e._parse( instring, loc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3417, in parseImpl
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3739, in parseImpl
loc, exprtokens = e._parse( instring, loc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
return self.expr._parse( instring, loc, doActions, callPreParse=False )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3739, in parseImpl
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3400, in parseImpl
return self.expr._parse( instring, loc, doActions, callPreParse=False )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1402, in _parseNoCache
loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1406, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 3400, in parseImpl
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 2711, in parseImpl
loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1406, in _parseNoCache
raise ParseException(instring, loc, self.errmsg, self)
pkg_resources._vendor.pyparsing.ParseException: Expected W:(abcd...) (at char 0), (line:1, col:1)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "C:\Users\Spencer\Apps\python\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 2711, in parseImpl
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\Spencer\Apps\python\lib\multiprocessing\spawn.py", line 125, in _main
raise ParseException(instring, loc, self.errmsg, self)
pkg_resources._vendor.pyparsing.ParseException: Expected W:(abcd...) (at char 0), (line:1, col:1)
Details -
(venv) C:\temp\testing\bags>python
Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:26:21) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
(venv) C:\temp\testing\bags>python -m pip freeze
bagit==1.8.1
Invocation (basically anything above 1 for the number of processes):
python -m bagit --processes 2 <folder_name>
I'm seeing the same behavior described by @ross-spencer with --processes, using bagit-python version 1.8.1 and Python 3.10.9 on a MacBook with an M2 chip.
I ran into this issue as well, running bagit-python version 1.8.1, Python 3.11.8 on Windows 10 and Windows Server 2019.
I think it does relate to differences in multiprocessing suggested by @edsu.
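For anyone wanting to check which case they are in, here is a minimal sketch (not part of bagit; describe_start_method is a hypothetical helper) that reports the start method the interpreter will use. Windows only supports spawn, and macOS has defaulted to spawn since Python 3.8, which would fit the reports above from both platforms.

```python
import multiprocessing


def describe_start_method():
    """Return (start_method, main_is_reimported) for this interpreter.

    Hypothetical helper for illustration only. With "spawn", each worker
    starts a fresh interpreter and re-imports the main module under the
    name "__mp_main__" instead of "__main__".
    """
    method = multiprocessing.get_start_method()
    return method, method == "spawn"


if __name__ == "__main__":
    method, reimported = describe_start_method()
    print(f"start method: {method}; main module re-imported in workers: {reimported}")
```

If this prints spawn, module-level code in the main script runs again in every worker, which is presumably where the repeated tracebacks come from.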
My limited understanding of the problem: in the spawned pool processes, line 47,
MODULE_NAME = "bagit" if __name__ == "__main__" else __name__
sets MODULE_NAME to __mp_main__. When this is passed to get_distribution(MODULE_NAME) on line 52, it causes an exception, I guess because of the underscores in __mp_main__.
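To make that guess a bit more concrete, here is a small sketch that mirrors the line 47 expression as a function and checks names against the PEP 508 rule that a distribution name must begin and end with a letter or digit. Both function names are hypothetical, and the regex is my transcription of the PEP 508 pattern, not code taken from pkg_resources.

```python
import re

# PEP 508 name rule: starts and ends with a letter or digit;
# ".", "_" and "-" are only allowed in between.
PEP508_NAME = re.compile(r"^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$", re.IGNORECASE)


def module_name_for(name):
    """Mirror of the line 47 expression, with __name__ passed in explicitly."""
    return "bagit" if name == "__main__" else name


def is_valid_distribution_name(name):
    """True if name satisfies the PEP 508 naming rule above."""
    return PEP508_NAME.match(name) is not None


if __name__ == "__main__":
    for candidate in ("__main__", "__mp_main__"):
        resolved = module_name_for(candidate)
        print(candidate, "->", resolved, "valid:", is_valid_distribution_name(resolved))
```

Under this rule a name like mp_main would be accepted, so it looks like the leading underscore is what trips the parser rather than underscores in general. That seems consistent with the "(at char 0)" in the traceback.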
Adding an except InvalidRequirement: clause to that try/except block gets rid of those errors, but the logging generated by spawned processes still doesn't work as expected. They're getting their own logger on line 49, and it doesn't have a logging.basicConfig() statement to set things up.
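An alternative to catching the extra exception would be to never hand __mp_main__ to the lookup in the first place. The sketch below is only an illustration under assumptions: it swaps in importlib.metadata (Python 3.8+) for pkg_resources, the function names are hypothetical, and the "0.0.dev0" fallback is a placeholder I picked, not necessarily what bagit uses.

```python
from importlib.metadata import PackageNotFoundError, version


def distribution_name(name, package="bagit"):
    """Map both script-style module names to the real distribution name.

    "__mp_main__" is the name the main module gets in spawned workers.
    """
    return package if name in ("__main__", "__mp_main__") else name


def lookup_version(name, default="0.0.dev0"):
    """Return the installed version, or a placeholder when the
    distribution cannot be found (for example, running from a checkout)."""
    try:
        return version(distribution_name(name))
    except PackageNotFoundError:
        return default


if __name__ == "__main__":
    print(lookup_version(__name__))
```

This only addresses the version lookup; it does nothing for the logging setup in the workers.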
I think there might be a solution in logging.QueueHandler and logging.QueueListener, but haven't looked into it further.
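A rough sketch of what that could look like, using only the standard library. The helper names (start_listener, configure_worker) are hypothetical and nothing here is taken from bagit. The demo at the bottom uses an in-process queue.Queue so it runs anywhere; with real worker processes the queue would need to be a multiprocessing.Queue.

```python
import logging
import logging.handlers
import queue


def start_listener(log_queue, *handlers):
    """Parent side: drain log_queue in a background thread and pass each
    record to the given handlers."""
    listener = logging.handlers.QueueListener(log_queue, *handlers)
    listener.start()
    return listener


def configure_worker(log_queue, level=logging.INFO, logger_name=None):
    """Worker side: replace the logger's handlers with a QueueHandler so
    records travel back to the parent instead of being dropped.

    logger_name=None configures the root logger.
    """
    logger = logging.getLogger(logger_name)
    logger.handlers[:] = [logging.handlers.QueueHandler(log_queue)]
    logger.setLevel(level)


if __name__ == "__main__":
    log_queue = queue.Queue()
    listener = start_listener(log_queue, logging.StreamHandler())
    configure_worker(log_queue)
    logging.getLogger("worker").info("message routed through the queue")
    listener.stop()  # flushes anything still queued before returning
```

With a pool, I would expect configure_worker to be passed as the initializer, with the queue in initargs, so every spawned process sets up its handler before doing any work. I haven't tried that against bagit itself.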
When I attempt to run bagit.py with the --processes option, I receive an enormous output of parsing exceptions. This does not occur when I do NOT invoke --processes. Is there something wrong with my Python installation, or is there something wrong with this component of the program?
I am running the 64-bit Windows version of Python 3.7.2 on a machine with 4 cores, 2 threads per core. I've attached a sample of the parsing exceptions output.
Thanks!
python_bagit_multiprocessing_error.txt