webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
https://pypi.python.org/pypi/pywb
GNU General Public License v3.0

Allow for catch-all wildcard *, in ACLs #881

Closed: tw4l closed 3 months ago

tw4l commented 4 months ago

Is your feature request related to a problem? Please describe.

Currently, if a user wants to set default_access: block but allow particular users to have access to any URL, there is no way to do this other than enumerating all of the possible TLDs in their collection in an ACLJ file, e.g.:

com, - {"access": "allow", "user": "staff"}
org, - {"access": "allow", "user": "staff"}
...

Describe the solution you'd like

I'd like a user to be able to specify an ACL rule matching any possible URL, with the following syntax:

*, - {"access": "allow", "user": "staff"}
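To make the requested behaviour concrete, here is a minimal sketch of how a catch-all key could be resolved alongside ordinary prefix rules. It is not pywb's actual implementation: `parse_aclj_line` and `find_access` are hypothetical helpers, the `<key> - <json>` line format is taken from the examples above, and the precedence rule (the longest matching key wins, with `*,` treated as the least specific) is an assumption made for illustration.

```python
import json


def parse_aclj_line(line):
    """Split an ACLJ-style line '<key> - <json>' into (key, rule dict)."""
    key, _, payload = line.partition(" - ")
    return key.strip(), json.loads(payload)


def find_access(rules, surt_key, user=None, default_access="block"):
    """Return the access setting of the most specific applicable rule.

    Assumptions (not confirmed by pywb): a rule without a "user" field
    applies to everyone, a rule with one applies only to that user, and
    the catch-all key '*,' matches any URL key but loses to any longer
    prefix match.
    """
    best = None  # (specificity, access)
    for key, rule in rules:
        if rule.get("user") not in (None, user):
            continue  # rule is restricted to a different user
        is_catch_all = key == "*,"
        if not (is_catch_all or surt_key.startswith(key)):
            continue  # key does not cover this URL
        specificity = 0 if is_catch_all else len(key)
        if best is None or specificity > best[0]:
            best = (specificity, rule["access"])
    return best[1] if best else default_access


rules = [
    parse_aclj_line('*, - {"access": "allow", "user": "staff"}'),
    parse_aclj_line('org,example)/private - {"access": "block", "user": "staff"}'),
]

print(find_access(rules, "com,example)/", user="staff"))         # allow
print(find_access(rules, "com,example)/"))                       # block
print(find_access(rules, "org,example)/private", user="staff"))  # block
```

With a single `*,` rule, staff get access to every URL while anonymous users still fall through to `default_access: block`, which is the case the TLD enumeration above was working around.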

Describe alternatives you've considered

Additional context

This originally came in as a user request on IIPC Slack.

ptrourke commented 3 months ago

Testing this version, my podman container is failing, and I'm seeing this error:

*** Operational MODE: async ***
mounting /pywb/pywb/apps/wayback.py on /playback-services
Traceback (most recent call last):
  File "/pywb/pywb/apps/wayback.py", line 1, in <module>
    from gevent.monkey import patch_all; patch_all()
  File "/usr/local/lib/python3.8/site-packages/gevent/__init__.py", line 86, in <module>
    from gevent._hub_local import get_hub
  File "/usr/local/lib/python3.8/site-packages/gevent/_hub_local.py", line 101, in <module>
    import_c_accel(globals(), 'gevent.__hub_local')
  File "/usr/local/lib/python3.8/site-packages/gevent/_util.py", line 148, in import_c_accel
    mod = importlib.import_module(cname)
  File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "src/gevent/_hub_local.py", line 1, in init gevent._gevent_c_hub_local
ValueError: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 152 from C header, got 40 from PyObject

(playback-services is the name of our deployment for PyWB-based playback)

This seems to be a version of this error? https://github.com/python-greenlet/greenlet/issues/178

Reverting to 2.7.4 has resolved the issue.

tw4l commented 3 months ago

Thanks for pointing this out, @ptrourke! We'll be turning our attention to pywb issues in the next sprint and should be able to resolve both this and the failing tests on main then.

tw4l commented 3 months ago

Hi @ptrourke, it looks like this was caused by an incompatible version of greenlet being used with gevent. A number of dependencies needed updating, and those updates are being handled in https://github.com/webrecorder/pywb/pull/839. Once that is merged into main and we rebase this branch, you shouldn't run into the issue anymore, and the tests should pass again.
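For anyone who wants to check which gevent and greenlet versions ended up in their environment, a small sketch using only the standard library (Python 3.8+) is shown below. `installed_versions` is a hypothetical helper name, and which version pairs are actually compatible is not stated in this thread, so the script only reports what is installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["gevent", "greenlet"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this inside the container before and after the dependency update should show whether the greenlet version changed.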