Open weiglemc opened 7 years ago
Here's the screenshot of my Security & Privacy System Prefs panel:
I am getting asked this repeatedly, @N0taN3rd, even after clicking "allow" each time, in version 1.1.0b2.5. This happens even when WAIL is in the background and crawls are running. I presume some background process communicating with pywb is invoking the wayback binary. Any idea why the "allow" is not sticking? macOS 10.12.4
@machawk1 That is a really good question and one I have not been able to put my finger on concretely. On the WAIL side, all network requests are made either with the Node.js built-in http/https libraries (which is what the libraries WAIL uses come down to when you drill into their code) or by Electron's Chromium when initiating the single-page crawl. Libraries used by WAIL currently that make network requests:
So the request for network permissions for WAIL itself covers these two, which eliminates WAIL as the source.
Heritrix makes many network requests, but it is Java-based and, when launched, is run by the 1.7 JVM. The only compilation done is the JVM's JIT of the class files in Heritrix's JAR. So Heritrix plays nice thanks to the JVM and how Java applications are designed, which allows the network permissions to stick.
Pywb, i.e. the wayback binary. Now this guy is interesting because its usage depends on the output of pyinstaller and how it links the executables to the packaged Python runtime. Pyinstaller packages all the .so/.dylib etc. files used by the Python version on the compiling machine. From the pyinstaller docs:
First process: bootloader starts.
- If one-file mode, extract bundled files to temppath/_MEIxxxxxx.
- Modify various environment variables:
  - Linux: save original value of LD_LIBRARY_PATH into LD_LIBRARY_PATH_ORIG, prepend our path to LD_LIBRARY_PATH.
  - AIX: same thing, but using LIBPATH and LIBPATH_ORIG.
  - OSX: unset DYLD_LIBRARY_PATH.
- Set up to handle signals for both processes.
- Run the child process.
- Wait for the child process to finish.
- If one-file mode, delete temppath/_MEIxxxxxx.
Second process: bootloader itself started as a child process.
- On Windows set the activation context.
- Load the Python dynamic library. The name of the dynamic library is embedded in the executable file.
- Initialize Python interpreter: set sys.path, sys.prefix, sys.executable.
- Run python code.
Running Python code requires several steps:
- [...] frozen and _MEIPASS to the sys built-in module.
- [...] ./eggs directory. Installing means appending .egg file names to sys.path. Python automatically detects whether an item in sys.path is a zip file or a directory.
As you can see, each time a pyinstaller "installed" program is run, it sets up a unique Python VM, if you will. Now pywb uses a WSGI server, which does its own thing. This is where I believe the issue originates. Due to the internals of pywb and how pyinstaller bootstraps everything, the "application/process" unique identifier for the permission is not found each time pywb is launched or the WSGI internals spin up a new connection handler, etc. WAIL repeatedly restarts pywb because it cannot dynamically learn about new collections without restarting.
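To make the quoted bootloader behaviour a bit more concrete, here is a minimal Python sketch. The attributes sys.frozen and sys._MEIPASS are the ones named in the pyinstaller docs excerpt. The helper names runtime_info and simulate_onefile_extraction are hypothetical, and the simulation only mimics the temppath/_MEIxxxxxx naming with tempfile. It is not pyinstaller's actual extraction code, and it does not by itself prove that this is what confuses the macOS permission check.

```python
import os
import sys
import tempfile


def runtime_info():
    """Report whether this interpreter was started by a pyinstaller bootloader."""
    return {
        # sys.frozen is only set inside a bundled app.
        "frozen": bool(getattr(sys, "frozen", False)),
        # sys._MEIPASS is the bundle directory; None when not bundled.
        "bundle_dir": getattr(sys, "_MEIPASS", None),
        "executable": sys.executable,
    }


def simulate_onefile_extraction():
    """Mimic one-file mode: a fresh _MEIxxxxxx directory is created per launch."""
    return tempfile.mkdtemp(prefix="_MEI")


if __name__ == "__main__":
    print(runtime_info())
    first = simulate_onefile_extraction()
    second = simulate_onefile_extraction()
    # Two "launches" never share the same extraction directory.
    print(first, second, first != second)
    os.rmdir(first)
    os.rmdir(second)
```

Dropping a call like runtime_info() into the bundled wayback entry point and logging the result on each restart would be one way to check how much of the runtime location actually changes between launches.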
Maybe this is a function of my Mac security settings, but every time a tweet is archived, I get the attached security pop-up.
It may be that all we need to do to address this is add some documentation on how to adjust the settings.
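If we do go the documentation route, a rough sketch of what it could describe is below. It assumes the pop-up is the macOS application firewall prompt and uses the socketfilterfw tool that ships with macOS (its --add and --unblockapp options). The function name firewall_allow_commands and the /path/to/wayback placeholder are hypothetical; the real location of the bundled wayback binary would need to be filled in. The script only builds and prints the commands, it does not run them, and I have not verified that the rule sticks for a pyinstaller-built binary.

```python
import shlex

# Firewall command-line tool that ships with macOS.
SOCKETFILTERFW = "/usr/libexec/ApplicationFirewall/socketfilterfw"


def firewall_allow_commands(binary_path):
    """Return argv lists that register a binary with the firewall and unblock it."""
    return [
        ["sudo", SOCKETFILTERFW, "--add", binary_path],
        ["sudo", SOCKETFILTERFW, "--unblockapp", binary_path],
    ]


if __name__ == "__main__":
    # Placeholder path (assumption): replace with the actual wayback binary.
    for argv in firewall_allow_commands("/path/to/wayback"):
        print(shlex.join(argv))
```

Printing the commands instead of executing them keeps the sudo step in the user's hands, which seems safer for something we would paste into the docs.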