Open ChrisKeefe opened 2 years ago
@cherman2 helped diagnose this issue today. Thank you!
A `.qza` or `.qzv` is a QIIME 2 Archive. The MacOS file browser's zip tool ('Archive Utility') adds a `__MACOSX` directory when it zips things. Inside this directory is a collection of files named `._something.qz*`, which are not zip files. The `zip` command available from the MacOS terminal does not appear to create the `__MACOSX` directory.

When the zip is extracted with `nemo` (and possibly other file browsers), the false `.qzx` (`.qza`/`.qzv`) files in the resulting `__MACOSX` directory break parsing with the confusing error message below. Unzipping in terminal with `zip` seems to drop the directory.

TLDR: this error only arises in cases where a Mac user has zipped a collection of `.qzx` files in their file browser and sent it to a non-MacOS machine, where it was unzipped (probably again in the file browser) for parsing.
chris:~/src/provenance_lib (main)> replay reproducibility-supplement --i-in-fp testfiles-ahhh\ \(1\)/ --o-out-fp whatever.zip --p-recurse
Parsing testfiles-ahhh (1)/testfiles-ahhh/multiplexed-seqs.qza
Parsing testfiles-ahhh (1)/testfiles-ahhh/newdir/demux-paired-end-ahhhh.qza
Parsing testfiles-ahhh (1)/__MACOSX/testfiles-ahhh/._multiplexed-seqs.qza
Traceback (most recent call last):
  File "/home/chris/miniconda3/envs/q2-22.2/bin/replay", line 33, in <module>
    sys.exit(load_entry_point('provenance-lib', 'console_scripts', 'replay')())
  File "/home/chris/miniconda3/envs/q2-22.2/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/chris/miniconda3/envs/q2-22.2/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/chris/miniconda3/envs/q2-22.2/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/chris/miniconda3/envs/q2-22.2/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/chris/miniconda3/envs/q2-22.2/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/chris/src/provenance_lib/provenance_lib/click_commands.py", line 184, in reproducibility_supplement
    write_reproducibility_supplement(
  File "/home/chris/src/provenance_lib/provenance_lib/replay.py", line 722, in write_reproducibility_supplement
    dag = ProvDAG(artifact_data=payload, validate_checksums=validate_checksums,
  File "/home/chris/src/provenance_lib/provenance_lib/parse.py", line 108, in __init__
    parser_results = parse_provenance(cfg, artifact_data)
  File "/home/chris/src/provenance_lib/provenance_lib/parse.py", line 443, in parse_provenance
    return parser.parse_prov(cfg, payload)
  File "/home/chris/src/provenance_lib/provenance_lib/parse.py", line 382, in parse_prov
    with zipfile.ZipFile(archive) as zf:
  File "/home/chris/miniconda3/envs/q2-22.2/lib/python3.8/zipfile.py", line 1269, in __init__
    self._RealGetContents()
  File "/home/chris/miniconda3/envs/q2-22.2/lib/python3.8/zipfile.py", line 1336, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
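The last frames of the traceback above can be reproduced in isolation. The following is a minimal sketch, not code from `provenance_lib`: the helper name `try_open_as_zip` is hypothetical, and the file contents are arbitrary bytes standing in for whatever the file browser actually writes into `__MACOSX`. It only illustrates that `zipfile.ZipFile` rejects any file that is not a zip archive, regardless of its extension.

```python
import tempfile
import zipfile
from pathlib import Path


def try_open_as_zip(path):
    """Return None if `path` opens as a zip archive, else the error message.

    Hypothetical helper for illustration only.
    """
    try:
        with zipfile.ZipFile(path):
            return None
    except zipfile.BadZipFile as err:
        return str(err)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a `._something.qza` file; the bytes are arbitrary,
    # the only thing that matters is that they are not a zip archive.
    fake = Path(tmp) / "__MACOSX" / "._multiplexed-seqs.qza"
    fake.parent.mkdir()
    fake.write_bytes(b"not a zip archive")
    message = try_open_as_zip(fake)
    print(message)
```

The `.qza` extension alone is therefore not enough to decide whether a file should be handed to the parser.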
When replaying an extracted zip archive of files that was compressed on OSX, parsing fails because the included `__MACOSX` directory contains non-zipfiles named `._something.qz*`.

We need to handle this, whether by checking for `__MACOSX` in the fp, removing `__MACOSX`, or catching `BadZipFile("File is not a zip file")` errors and looking more closely at them.
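As a rough sketch of the first option (checking for `__MACOSX` while collecting files), something like the following might work. The function `find_archives` and its `extensions` parameter are hypothetical and not part of `provenance_lib`; how the real recursive collection is implemented is not shown in this issue, so this assumes a plain `os.walk` over the input directory.

```python
import os
import zipfile


def find_archives(root, extensions=(".qza", ".qzv")):
    """Yield paths under `root` that look like real QIIME 2 archives.

    Hypothetical helper, not part of provenance_lib.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune __MACOSX in place so os.walk never descends into it.
        dirnames[:] = [d for d in dirnames if d != "__MACOSX"]
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Extra guard: skip anything that is not a zip file at all.
            if not zipfile.is_zipfile(path):
                continue
            yield path
```

Note that the `zipfile.is_zipfile` guard silently skips every non-zip file, including a genuinely corrupted archive. If those should be reported instead, catching `BadZipFile` at parse time and raising a clearer error is probably the better fit.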