SuffolkLITLab / FormFyxer

A tool for learning about and pre-processing forms
MIT License

Add sentence count, passive voice and citations #70

Closed nonprofittechy closed 2 years ago

nonprofittechy commented 2 years ago

fix #60 fix #62 fix #65

Adds several more stats. I tested this by installing it on the apps-dev server; the new stats can be seen at this link: https://apps-dev.suffolklitlab.org/pdfstats/view/011eafd5f4867f668fb51239141443c6ae5f20c453826cd898ce91cd.pdf

nonprofittechy commented 2 years ago

Per Jonathan:

The standard uwsgi setup is one process, six threads (cores).
I would recommend loading your module from a module that is pre-loaded, so that the import happens when the web application (re)starts.
However, even if you loaded the module with an imports block in an interview, that might not make a difference, because as far as I can tell by doing experiments, module loading that happens during an HTTP request seems to affect the whole process and not just a single thread within a process.
The uwsgi process loads docassemble.webapp.run, which contains:
    from docassemble.webapp.server import app as application
    if __name__ == "__main__":
        application.run(host='0.0.0.0')
During the first line, docassemble add-on modules are imported, because this line runs all the code in docassemble.webapp.server, and that includes the import_necessary() function. So the importing of add-on modules happens before application.run(), which is what actually runs the Flask loop that listens for connections.
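The observation that an import made during a request affects the whole process, not just one thread, can be checked with a small standalone sketch. This is not docassemble code: the helper name `import_in_thread` and the choice of `colorsys` as the module to load are purely illustrative. It relies only on the fact that `sys.modules` is a single per-process cache.

```python
import importlib
import sys
import threading


def import_in_thread(module_name: str) -> bool:
    """Import module_name in a worker thread, then report whether
    the calling thread can see it in the process-wide module cache."""

    def worker() -> None:
        # The import runs entirely inside the worker thread.
        importlib.import_module(module_name)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    # sys.modules is shared by every thread in the process.
    return module_name in sys.modules


# Hypothetical module choice; any importable module would do.
print(import_in_thread("colorsys"))
```

If this holds in the uwsgi setup as well, a heavy module would be loaded once per process no matter which thread triggers the import, which is consistent with the recommendation to pre-load it at startup.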
How memory gets handled is something I don't fully understand. I have heard that memory is shared until it is modified, after which there is a separate copy. But I don't know whether that applies only to process forking and not to threads.
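On the threads-versus-forking question, a rough standalone sketch (Unix only, since it uses `os.fork`; the function names are made up for illustration) shows the visible difference: a thread mutates the same object the caller holds, while a forked child works on its own copy. It demonstrates isolation after a fork, not how the kernel shares physical pages underneath.

```python
import os
import threading


def mutate_in_thread(data: list) -> list:
    """Append to data from a worker thread; threads share the
    process's memory, so the caller sees the change."""
    thread = threading.Thread(target=data.append, args=("thread",))
    thread.start()
    thread.join()
    return data


def mutate_in_fork(data: list) -> list:
    """Append to data from a forked child; the child modifies its
    own copy, so the parent's list is left unchanged."""
    pid = os.fork()
    if pid == 0:
        # Child process: this write is private to the child.
        data.append("child")
        os._exit(0)
    # Parent process: wait for the child to finish.
    os.waitpid(pid, 0)
    return data


print(mutate_in_thread([]))
print(mutate_in_fork([]))
```

In this sketch the thread's write is visible to the caller and the child's write is not. Whether a forked worker actually saves memory for a large loaded model is a separate question, because CPython's reference counting writes to objects even when they are only read.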

nonprofittechy commented 2 years ago

This is ready to merge (it no longer balloons memory usage), although we're going to want to make a lot of changes to the "time to answer" stat.