Open jdbalistreri opened 6 years ago
Hi,
I've got the same issue in the same conditions. Here is a partial stack trace for the error:
ERR Averaged perceptron model could not be loaded
at address_parser_load (address_parser.c:205) errno: Cannot allocate memory
ERR Error loading address parser module, dir=(null)
at libpostal_setup_parser_datadir (libpostal.c:410) errno: Cannot allocate memory
Traceback (most recent call last):
[... private stack ...]
File "/usr/local/pyenv/versions/2.7.12/envs/api/lib/python2.7/site-packages/postal/parser.py", line 2, in <module>
from postal import _parser
TypeError: Error loading libpostal data
This is a known bug: https://github.com/openvenues/libpostal/issues/378
I have had success with this patch: https://github.com/openvenues/libpostal/issues/351#issuecomment-402778530
According to this comment, libpostal needs 1.8 GB of memory. I tried a t2.small running Ubuntu 18.04 LTS, but it failed. I ended up using a t2.medium for my Rails app, and it works just fine.
I've built a Lambda function written in JavaScript and packaged it as a Docker image, but it needs 4 GB of RAM to run and dedicated concurrency to eliminate cold starts, and it costs around $50 a month plus per-call charges. Otherwise, you need at least a medium EC2 instance.
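To make the sizing reports above concrete, here is a rough back-of-the-envelope sketch. It assumes the 1.8 GB per-process figure quoted in this thread, plus a hypothetical 0.5 GB reserved for the OS and the rest of the app (that number is only a placeholder, so measure your own). The function name is made up for illustration and is not part of libpostal or pypostal.

```python
# Per-process footprint quoted in this thread; treat it as an estimate.
LIBPOSTAL_GB = 1.8


def max_libpostal_processes(instance_ram_gb, reserved_gb=0.5,
                            per_process_gb=LIBPOSTAL_GB):
    """Estimate how many processes can each hold their own copy of the
    libpostal data on a box with `instance_ram_gb` of RAM.

    `reserved_gb` is an assumed allowance for the OS and the app itself.
    """
    usable_gb = instance_ram_gb - reserved_gb
    if usable_gb < per_process_gb:
        return 0
    return int(usable_gb // per_process_gb)


if __name__ == "__main__":
    # t2.small has 2 GiB of RAM, t2.medium has 4 GiB.
    for name, ram_gb in [("t2.small", 2), ("t2.medium", 4), ("8 GB box", 8)]:
        print(name, max_libpostal_processes(ram_gb))
```

Under these assumptions a t2.small cannot fit even one copy, while a t2.medium fits exactly one, which lines up with the experience reported above.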
Hi!
I have a question regarding best practices for using libpostal in a production environment.
My country is
United States
Here's how I'm using libpostal
I am running an alumni directory SaaS product. Sometimes we get address fields as one big string and need to parse them into individual components.
Here's what I did
I installed libpostal and pypostal for use in my Flask app. Things worked great on my local machine, but I ran into difficulty when deploying to AWS Elastic Beanstalk. Eventually I got it installed, but then hit an error on my EC2 instance indicating that there was not enough memory available to load the library (unfortunately, I don't have a paste of the exact error at the moment). I doubled the amount of RAM on my box (from 4 GB to 8 GB). Even with the extra RAM, I still ran into the out-of-memory error, though not every time.
What is the recommended way to use libpostal in a production environment? Am I correct in installing it on each of my application servers in production? If so, what is the recommended minimum amount of memory for a box? (The docs say libpostal requires 1.8 GB of memory to run.) I also noticed that my application was significantly slower when libpostal was running on the same box as my server. Is that expected?
If I'm doing all of the above correctly, another theory of mine is that multiple processes are each initializing libpostal, and each initialization takes 1.8 GB of memory. Is this possible based on how libpostal is implemented? If so, how could I investigate this possibility?
Thanks, Joe
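One possible way to investigate the multiple-process theory on a Linux box is to log each worker's PID and resident memory before and after it imports pypostal. The sketch below is only an illustration under that assumption: it reads `VmRSS` from `/proc/self/status` (Linux-only), and the helper names are made up here, not part of libpostal or pypostal.

```python
import os
import re


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def current_rss_mb():
    """Resident set size of the current process in MB (Linux only)."""
    with open("/proc/self/status") as status_file:
        rss_kb = parse_vm_rss_kb(status_file.read())
    return None if rss_kb is None else rss_kb / 1024.0


def log_rss(label):
    """Print the PID and current RSS so log lines can be grouped per worker."""
    print("pid=%d %s rss_mb=%s" % (os.getpid(), label, current_rss_mb()))


# Hypothetical usage inside the Flask app, around the pypostal import:
#
#     log_rss("before libpostal")
#     from postal.parser import parse_address
#     log_rss("after libpostal")
```

If several different PIDs each report a jump on the order of 1.8 GB, that would suggest every worker is loading its own copy of the data, and total usage would scale with the number of workers.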
Here's what I got
I don't have the exact error right now.
Here's what I was expecting
n/a
For parsing issues, please answer "yes" or "no" to all that apply.
n/a
Here's what I think could be improved
n/a