schollz / find3

High-precision indoor positioning framework, version 3.
https://www.internalpositioning.com/doc
MIT License

Memory leak in server? Also - api/v1/by_location is very slow and CPU/memory intensive? #8

Closed victorhooi closed 6 years ago

victorhooi commented 6 years ago

I have a find3 server running on a Google Compute Engine instance - n1-standard-1 (1 vCPU, 3.75 GB memory).

I have five Raspberry Pis that are scanning and sending fingerprints to this server. I have done training, and added around 8 locations.

I'm using docker stats find3server to track memory usage of find3server.

The memory usage starts out at around 193 MB. However, there seem to be two issues.

Firstly, over time the Docker container appears to consume memory (and not release it), until it eventually uses all of the machine's memory and crashes the host.

Secondly, if you call the http://server:8003/api/v1/by_location/family_name endpoint, this seems to accelerate the process - it seems to add a few hundred MB each time.

For example - initial memory:

[Screenshot: docker stats find3server output, 2018-03-04 10:39:31]

Then, calling http GET SERVER_IP_ADDRESS:8003/api/v1/by_location/gcc - new screenshot of docker stats find3server after each call:

[Screenshots: docker stats find3server output after each call, 2018-03-04 at 10:42:35, 10:45:05, 11:04:57, 11:30:11, 11:46:26 and 12:02:22]

However, even just leaving the machine alone, the memory usage climbs on its own.

Also, each request against api/v1/by_location/<family> takes longer and longer.

I had to change the httpie timeout from the default of 30 seconds, to 5 minutes, then 10 minutes, to let it complete. After around 5 calls, it would no longer complete even after 10 minutes.
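To put numbers on "longer and longer", the calls can be timed in a loop. The following is a minimal sketch, not part of find3: the helper names fetch_by_location and time_calls are hypothetical, and the base URL and family name are placeholders taken from the commands in this thread.

```python
import time
import urllib.request


def fetch_by_location(base_url, family, timeout_s=600):
    """GET <base_url>/api/v1/by_location/<family> and return the raw body.

    Note: urllib's timeout applies to each blocking socket operation,
    not to the request as a whole.
    """
    url = f"{base_url}/api/v1/by_location/{family}"
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.read()


def time_calls(fetch, n):
    """Call fetch() n times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Placeholder address: replace with the real server before running.
    timings = time_calls(
        lambda: fetch_by_location("http://SERVER_IP_ADDRESS:8003", "gcc"), 5
    )
    for i, seconds in enumerate(timings, start=1):
        print(f"call {i}: {seconds:.1f}s")
```

If the durations grow from one call to the next while the data set stays the same, that points at state accumulating in the server rather than at the size of the database.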

Also - even after restarting the entire machine, and starting find3server up again - api/v1/by_location still won't complete within 10 minutes =(.

schollz commented 6 years ago

I'm almost certain this has to do with my Naive Bayes implementation. Can you send me your database file to check out? You can put it in the Slack channel. (It's in the /data directory with some name like *.sqlite3.db)

victorhooi commented 6 years ago

Data file is attached:

bjAa.sqlite3.db.zip

victorhooi commented 6 years ago

I tried with the latest git pull, after some of the recent Naive Bayes changes:

https://github.com/schollz/find3/commit/7d00ee4568ac7ff5ea43a89058a409ecf8224f89
https://github.com/schollz/find3/commit/591c2dc903a6fb9e793f85a883e069d211d108a6

Memory usage seems slightly better - but still very high, and doesn't release:

CONTAINER           CPU %               MEM USAGE / LIMIT    MEM %               NET I/O             BLOCK I/O           PIDS
find3server_newml   1.56%               1015MiB / 3.607GiB   27.49%              1.1MB / 448kB       63.1MB / 771MB      28
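To track this over time rather than by screenshot, the MEM USAGE / LIMIT column can be sampled and converted to bytes. This is a rough sketch under stated assumptions: parse_mem_usage and sample_container are hypothetical helper names, and the unit table covers only the suffixes that appear in output like the above plus a few common ones.

```python
import re
import subprocess

# Byte multipliers for the size suffixes handled by this sketch.
_UNITS = {
    "B": 1,
    "kB": 1000, "MB": 1000**2, "GB": 1000**3,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3,
}


def _to_bytes(text):
    """Convert a size such as '1015MiB' or '3.607GiB' to an integer byte count."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def parse_mem_usage(field):
    """Split a 'USED / LIMIT' field into (used_bytes, limit_bytes)."""
    used, limit = field.split("/")
    return _to_bytes(used), _to_bytes(limit)


def sample_container(name):
    """Take one docker stats reading for a container (requires Docker)."""
    out = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.MemUsage}}", name],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    return parse_mem_usage(out)
```

Calling sample_container("find3server_newml") at a fixed interval and logging the first value would show whether usage keeps climbing while the server is idle.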

Also - the command still times out after 10 minutes:

victorhooi@victorhooi-wiki:~$ http --timeout=600 GET localhost:8003/api/v1/by_location/gcc >> ~/gcc_results

http: error: Request timed out (600.0s).

Anything else I should try?

schollz commented 6 years ago

This should be fixed now, as I found the error was from not closing loggers when generating them.
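The thread does not show the fix itself, so the following is only a generic illustration of this class of leak, written in Python rather than taken from find3's code. All names here (leaky_logger, scoped_logger, handle_requests, the "family.gcc" logger name) are hypothetical. The pattern is: a logger is set up per request, and each setup attaches a new handler that is never detached or closed.

```python
import io
import logging
from contextlib import contextmanager


def leaky_logger(name, stream):
    """Leaky pattern: every call attaches one more handler and never removes it."""
    logger = logging.getLogger(name)  # loggers are cached by name
    logger.addHandler(logging.StreamHandler(stream))
    return logger


@contextmanager
def scoped_logger(name, stream):
    """Yield a logger with one handler, then always detach and close it."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo output out of the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        yield logger
    finally:
        # Detach the handler and release its resources; for a FileHandler,
        # close() is what closes the underlying file.
        logger.removeHandler(handler)
        handler.close()


def handle_requests(n):
    """Simulate n requests; return (handlers left attached, lines written)."""
    sink = io.StringIO()
    for i in range(n):
        with scoped_logger("family.gcc", sink) as log:
            log.info("request %d", i)
    remaining = len(logging.getLogger("family.gcc").handlers)
    return remaining, sink.getvalue().count("\n")
```

With the scoped version, the number of attached handlers returns to zero after each request, whereas the leaky version grows by one handler per call.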

cpressman commented 4 years ago

I notice this was closed 2 years ago, but I have been experiencing something very similar. When I go into the Docker container using docker exec -ti find3server /bin/bash and then run top, I can see that memory for the flask process grows substantially (30%), as does memory for main (10%). This continues until the container is restarted, which frees the memory.