daniel-j-h / libosrmc

Pure C bindings for libosrm
MIT License

Memory leak #23

Open ZhiWeiCui opened 3 years ago

ZhiWeiCui commented 3 years ago

Hi:

Thank you so much for providing an amazing package.

I am using this package in the way you described. When the route method is called many times, the process's memory keeps increasing. Have you encountered this before?

Looking forward to your reply!

daniel-j-h commented 3 years ago

You need to call the _destruct functions to release memory, see
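As a rough illustration of that advice, here is a minimal sketch of pairing every construct call with its destruct call, so a handle is released even when a query raises. The `owned` helper and the `fake_construct` / `fake_destruct` stand-ins are hypothetical and only model the pattern; with the real bindings you would pass the matching `*_construct` / `*_destruct` pair from the library instead.

```python
from contextlib import contextmanager


@contextmanager
def owned(construct, destruct, *args):
    """Call construct(*args), yield the handle, and always call destruct(handle)."""
    handle = construct(*args)
    try:
        yield handle
    finally:
        # Runs on normal exit and on exceptions, so the handle is never leaked.
        destruct(handle)


# Stand-ins for a construct/destruct pair (hypothetical, not libosrmc calls).
live_handles = set()


def fake_construct(name):
    live_handles.add(name)
    return name


def fake_destruct(handle):
    live_handles.discard(handle)


def run_queries(n):
    """Create and release n handles; return how many are still alive afterwards."""
    for i in range(n):
        with owned(fake_construct, fake_destruct, f"response-{i}") as response:
            assert response in live_handles  # handle is valid inside the block
    return len(live_handles)
```

If `run_queries` returns anything other than zero, some handle was constructed without being destructed, which is the situation that shows up as steadily growing memory.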

ZhiWeiCui commented 3 years ago

Hi: Thank you very much for your reply! I am using the example from GeographicaGS/libosrmc with Python 3. I saw this method in osrmcpy.py, but I'm not sure it is the one you mean.

https://github.com/daniel-j-h/libosrmc/blob/38e6af24c0389854eea8065235e626b608ff8b32/bindings/osrmcpy.py#L153-L157

My test code is as follows. When I put router = OSRM(OSRM_DATASET.encode('utf-8'), contraction=True) inside the for loop, the memory is stable. For performance reasons I don't want to instantiate OSRM in every iteration. May I ask whether I have other options?

from pyosrm import PyOSRM, Status
from dpsolver.Util import randomly_location
import subprocess, os, re

from osrmcpy import OSRM, Coordinate

def convert_size(size):
    if size <1024:
        return "%d B" % size
    elif (size >= 1024) and (size < (1024 * 1024)):
        return "%.2f KB"%(size/1024)
    elif (size >= (1024*1024)) and (size < (1024*1024*1024)):
        return "%.2f MB"%(size/(1024*1024))
    else:
        return "%.2f GB"%(size/(1024*1024*1024))

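def process_info():
    # Query only this process; `ps aux | grep <pid>` can match unrelated lines.
    pid = os.getpid()
    res = subprocess.getstatusoutput('ps u -p ' + str(pid))[1].split('\n')[-1]

    # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    p = re.compile(r'\s+')
    l = p.split(res)
    info = {'user': l[0],
            'pid': l[1],
            'cpu': l[2],
            'mem': l[3],
            'vsz': convert_size(int(l[4])*1024),  # ps reports VSZ/RSS in KiB
            'rss': convert_size(int(l[5])*1024),
            'start_time': l[8]}  # l[6] is TTY, START is the ninth column
    return info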

# location
location_start = randomly_location(10000)
location_end = randomly_location(10000)

# osrmcpy
DATA_DIR = '/mnt/d/Data/osrm/bicycle/ch'
OSRM_DATASET = os.path.join(DATA_DIR, 'china-latest.osrm')
router = OSRM(OSRM_DATASET.encode('utf-8'), contraction=True)

# pyosrm
# router = PyOSRM(use_shared_memory=True, algorithm='CH')

for start, end in zip(location_start.values(), location_end.values()):
    result = router.route([start[::-1], end[::-1]])
    print(process_info())
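One possible middle ground, sketched under the assumption that memory really is released when the `OSRM` object is dropped (as the stable in-loop behaviour suggests): rebuild the router only every N requests instead of on every iteration. `RecyclingRouter`, `factory`, and `max_calls` are hypothetical names, not part of osrmcpy, and this is a workaround rather than a fix for whatever is holding the memory.

```python
class RecyclingRouter:
    """Wrap a router and rebuild it after every `max_calls` route() calls."""

    def __init__(self, factory, max_calls=1000):
        self._factory = factory      # zero-argument callable returning a router
        self._max_calls = max_calls
        self._calls = 0
        self._router = factory()

    def route(self, coordinates):
        if self._calls >= self._max_calls:
            # Drop the old instance first so its cleanup can run
            # before a second copy of the dataset is loaded.
            self._router = None
            self._router = self._factory()
            self._calls = 0
        self._calls += 1
        return self._router.route(coordinates)
```

In the test script above this would be used as `router = RecyclingRouter(lambda: OSRM(OSRM_DATASET.encode('utf-8'), contraction=True))`, leaving the loop body unchanged. A larger `max_calls` amortises the load time over more requests at the cost of a higher memory peak.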

daniel-j-h commented 3 years ago

I don't know about the geographica fork, looks like they implement some more of the api surface. Maybe ask there.

The shared memory flag could be the issue here. After all these years I don't recall off the top of my head how it works, but if it's using mmap under the hood there's a high chance your kernel dynamically walks the page-sized chunks and loads them into memory as they are touched; try without it.
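To illustrate the point about mmap: pages of a memory-mapped file are counted in the process's RSS once they have been touched, so RSS can keep rising while more of a large dataset is read, without any heap leak. The sketch below maps a temporary file and reads one byte per page. `rss_bytes` and `touch_mapped_file` are illustrative helpers, reading `/proc/self/statm` is Linux-specific, and whether OSRM actually maps its data this way is not confirmed here.

```python
import mmap
import tempfile


def rss_bytes():
    """Resident set size of this process in bytes, or None without Linux /proc."""
    try:
        with open("/proc/self/statm") as f:
            # Second field of statm is the resident size in pages.
            return int(f.read().split()[1]) * mmap.PAGESIZE
    except OSError:
        return None


def touch_mapped_file(size_bytes):
    """Map a temporary file read-only and read one byte from every page.

    Returns (pages_touched, rss_before, rss_after).
    """
    with tempfile.TemporaryFile() as f:
        f.truncate(size_bytes)  # extend the empty file to the requested size
        with mmap.mmap(f.fileno(), size_bytes, access=mmap.ACCESS_READ) as mm:
            before = rss_bytes()
            pages = 0
            for offset in range(0, size_bytes, mmap.PAGESIZE):
                _ = mm[offset]  # first access makes the kernel fault the page in
                pages += 1
            after = rss_bytes()
    return pages, before, after
```

Calling `touch_mapped_file(256 * 1024 * 1024)` and comparing the two RSS values gives a feel for how much of the growth seen in `ps` could come from mapped pages rather than from allocations that were never freed.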

On June 4, 2021 2:03:32 AM UTC, ZhiWei Cui @.***> wrote:

ZhiWeiCui commented 3 years ago

Thank you again for your suggestion, I will consider it carefully!