jjjake / internetarchive

A Python and Command-Line Interface to Archive.org
GNU Affero General Public License v3.0

Random fails "Header must be of type str or bytes, not <class 'list'>" #661

Closed (maaaaz closed this issue 10 hours ago)

maaaaz commented 4 days ago

Hello @jjjake,

I would like to report a bug that I can't reproduce on demand but that now happens often, for unknown reasons. While running ia upload -q <item> <files glob pattern such as *.mp4>, I get:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/root/.pex/unzipped_pexes/6ad789d32c6e0b8ba03aafb7cd65770c6221aab6/__main__.py", line 106, in <module>
    bootstrap_pex(__entry_point__, execute=__execute__, venv_dir=__venv_dir__)
  File "/root/.pex/unzipped_pexes/6ad789d32c6e0b8ba03aafb7cd65770c6221aab6/.bootstrap/pex/pex_bootstrapper.py", line 627, in bootstrap_pex
    pex.PEX(entry_point).execute()
  File "/root/.pex/unzipped_pexes/6ad789d32c6e0b8ba03aafb7cd65770c6221aab6/.bootstrap/pex/pex.py", line 562, in execute
    sys.exit(self._wrap_coverage(self._wrap_profiling, self._execute))
  File "/root/.pex/unzipped_pexes/6ad789d32c6e0b8ba03aafb7cd65770c6221aab6/.bootstrap/pex/pex.py", line 469, in _wrap_coverage
    return runner(*args)
  File "/root/.pex/unzipped_pexes/6ad789d32c6e0b8ba03aafb7cd65770c6221aab6/.bootstrap/pex/pex.py", line 500, in _wrap_profiling
    return runner(*args)
  File "/root/.pex/unzipped_pexes/6ad789d32c6e0b8ba03aafb7cd65770c6221aab6/.bootstrap/pex/pex.py", line 606, in _execute
    return self.execute_entry(
  File "/root/.pex/unzipped_pexes/6ad789d32c6e0b8ba03aafb7cd65770c6221aab6/.bootstrap/pex/pex.py", line 808, in execute_entry
    return self.execute_entry_point(entry_point)
  File "/root/.pex/unzipped_pexes/6ad789d32c6e0b8ba03aafb7cd65770c6221aab6/.bootstrap/pex/pex.py", line 826, in execute_entry_point
    return runner()
  File "/root/.pex/installed_wheels/16f3182add31f68bdec1049888e23e9df12d5311e9179f8db0f3ea25efd8eaed/internetarchive-5.0.1-py3-none-any.whl/internetarchive/cli/ia.py", line 144, in main
    args.func(args)
  File "/root/.pex/installed_wheels/16f3182add31f68bdec1049888e23e9df12d5311e9179f8db0f3ea25efd8eaed/internetarchive-5.0.1-py3-none-any.whl/internetarchive/cli/ia_upload.py", line 132, in <lambda>
    parser.set_defaults(func=lambda args: main(args, parser))
  File "/root/.pex/installed_wheels/16f3182add31f68bdec1049888e23e9df12d5311e9179f8db0f3ea25efd8eaed/internetarchive-5.0.1-py3-none-any.whl/internetarchive/cli/ia_upload.py", line 308, in main
    for _r in _upload_files(item, files, upload_kwargs):
  File "/root/.pex/installed_wheels/16f3182add31f68bdec1049888e23e9df12d5311e9179f8db0f3ea25efd8eaed/internetarchive-5.0.1-py3-none-any.whl/internetarchive/cli/ia_upload.py", line 148, in _upload_files
    response = item.upload(files, **upload_kwargs)
  File "/root/.pex/installed_wheels/16f3182add31f68bdec1049888e23e9df12d5311e9179f8db0f3ea25efd8eaed/internetarchive-5.0.1-py3-none-any.whl/internetarchive/item.py", line 1283, in upload
    resp = self.upload_file(body,
  File "/root/.pex/installed_wheels/16f3182add31f68bdec1049888e23e9df12d5311e9179f8db0f3ea25efd8eaed/internetarchive-5.0.1-py3-none-any.whl/internetarchive/item.py", line 1087, in upload_file
    prepared_request = request.prepare()
  File "/root/.pex/installed_wheels/16f3182add31f68bdec1049888e23e9df12d5311e9179f8db0f3ea25efd8eaed/internetarchive-5.0.1-py3-none-any.whl/internetarchive/iarequest.py", line 67, in prepare
    p.prepare(
  File "/root/.pex/installed_wheels/16f3182add31f68bdec1049888e23e9df12d5311e9179f8db0f3ea25efd8eaed/internetarchive-5.0.1-py3-none-any.whl/internetarchive/iarequest.py", line 93, in prepare
    self.prepare_headers(headers, metadata,
  File "/root/.pex/installed_wheels/16f3182add31f68bdec1049888e23e9df12d5311e9179f8db0f3ea25efd8eaed/internetarchive-5.0.1-py3-none-any.whl/internetarchive/iarequest.py", line 167, in prepare_headers
    super().prepare_headers(headers)
  File "/root/.pex/installed_wheels/2f57f4ae5920a2f5f55a8f4971fe010b00cb2a30f8ab9c572826150fa309c450/requests-2.32.3-py3-none-any.whl/requests/models.py", line 490, in prepare_headers
    check_header_validity(header)
  File "/root/.pex/installed_wheels/2f57f4ae5920a2f5f55a8f4971fe010b00cb2a30f8ab9c572826150fa309c450/requests-2.32.3-py3-none-any.whl/requests/utils.py", line 1042, in check_header_validity
    _validate_header_part(header, value, 1)
  File "/root/.pex/installed_wheels/2f57f4ae5920a2f5f55a8f4971fe010b00cb2a30f8ab9c572826150fa309c450/requests-2.32.3-py3-none-any.whl/requests/utils.py", line 1051, in _validate_header_part
    raise InvalidHeader(
requests.exceptions.InvalidHeader: Header part (['Internet Archive Python library 5.0.1', 'Internet Archive Python library 5.0.1']) from ('x-archive-meta00-scanner', ['Internet Archive Python library 5.0.1', 'Internet Archive Python library 5.0.1']) must be of type str or bytes, not <class 'list'>

I don't know if it comes from the <files glob pattern such as *.mp4> or something else.
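
For what it's worth, the final exception can apparently be triggered without ia at all, just by giving requests a header whose value is a list. This is only a minimal sketch, assuming a requests version that behaves like the 2.32.3 in the traceback; the URL is a placeholder and nothing is sent over the network:

```python
import requests
from requests.exceptions import InvalidHeader

# Same header name and duplicated value as in the traceback above.
headers = {
    "x-archive-meta00-scanner": [
        "Internet Archive Python library 5.0.1",
        "Internet Archive Python library 5.0.1",
    ]
}

# Placeholder URL (assumption); prepare() only builds the request, it does not send it.
request = requests.Request(
    "PUT", "https://example.invalid/item/file.mp4", headers=headers
)

error = None
try:
    request.prepare()
except InvalidHeader as exc:
    # requests refuses header values that are not str or bytes.
    error = exc

print(type(error).__name__, error)
```

So the glob pattern itself may not matter; what seems to matter is that the scanner value reaches requests as a list instead of a string.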

Cheers!

jjjake commented 2 days ago

Sorry for the trouble @maaaaz. I'm seeing similar reports elsewhere (https://github.com/jjjake/internetarchive/pull/662 for example).

Please let me know if you're able to reproduce it or provide any other info. I'll try to figure out what's going on here ASAP.

maaaaz commented 11 hours ago

Hey @jjjake, in the release notes for v5.0.3 I can see: Fixed bug where InvalidHeader was being raised when a custom scanner was provided in some cases.

Is the fix related to this issue?
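
To illustrate the kind of problem I mean, here is a rough sketch of how list-valued metadata could be turned into string-only headers before it reaches requests. This is not the code from v5.0.3: expand_metadata_headers is a hypothetical helper, and the x-archive-meta-<key> / x-archive-metaNN-<key> header naming is my assumption about the convention used for uploads.

```python
def expand_metadata_headers(metadata):
    """Hypothetical helper: build headers whose values are always str."""
    headers = {}
    for key, value in metadata.items():
        # Treat a scalar as a one-element list so both cases share one path.
        values = value if isinstance(value, (list, tuple)) else [value]
        # Drop exact duplicates while keeping the original order.
        unique = list(dict.fromkeys(str(v) for v in values))
        if len(unique) == 1:
            headers[f"x-archive-meta-{key}"] = unique[0]
        else:
            # One indexed header per distinct value (assumed naming scheme).
            for index, item in enumerate(unique):
                headers[f"x-archive-meta{index:02d}-{key}"] = item
    return headers


headers = expand_metadata_headers(
    {
        "scanner": [
            "Internet Archive Python library 5.0.1",
            "Internet Archive Python library 5.0.1",
        ],
        "subject": ["video", "archive"],
    }
)
print(headers)
```

With the duplicated scanner value from my traceback, this yields a single string header rather than a list.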

Anyway, thank you!

jjjake commented 10 hours ago

@maaaaz Yes, my apologies -- I meant to follow up here! Please let me know if you run into any other issues. Thanks again for the report.