wannesm / LeuvenMapMatching

Leuven.MapMatching toolbox for aligning GPS measurements to locations on a map.

Error in Loading Serialized Map Data with InMemMap.deserialize #50

Open SheriffRabbit opened 5 months ago

SheriffRabbit commented 5 months ago

https://stackoverflow.com/questions/77878622/error-in-loading-serialized-map-data-with-inmemmap-deserialize

I am attempting to load map data from a pickle file using map_connn = InMemMap.deserialize(map_conn). After exporting the map data from memory with map_con.dump() and loading the resulting pickle back as a dictionary, the following error occurs:

RTreeError: Error in "Index_CreateWithStream": Spatial Index Error: IllegalArgumentException: SpatialIndex::DiskStorageManager: Index/Data file cannot be created.

map_con.dump()

import pickle
import os

map_data_file = 'E:\jupyternotebook\共享单车路径匹配\map_con\myosm.pkl'

if os.path.exists(map_data_file):
    with open(map_data_file, 'rb') as f:
        map_conn = pickle.load(f)
    map_connn = InMemMap.deserialize(map_conn)


---------------------------------------------------------------------------
RTreeError                                Traceback (most recent call last)
Cell In[5], line 1
----> 1 map_connn = InMemMap.deserialize(map_conn)

File D:\miniconda3\envs\Map_Matching\lib\site-packages\leuvenmapmatching\map\inmem.py:144, in InMemMap.deserialize(cls, data)
    141 @classmethod
    142 def deserialize(cls, data):
    143     """Create a new instance from a dictionary."""
--> 144     nmap = cls(data["name"], dir=data.get("dir", None),
    145                use_latlon=data["use_latlon"], use_rtree=data["use_rtree"],
    146                index_edges=data["index_edges"],
    147                crs_lonlat=data.get("crs_lonlat", None), crs_xy=data.get("crs_xy", None),
    148                graph=data.get("graph", None), linked_edges=data.get("linked_edges", None),
    149                deserializing=True)
    150     return nmap

File D:\miniconda3\envs\Map_Matching\lib\site-packages\leuvenmapmatching\map\inmem.py:81, in InMemMap.__init__(self, name, use_latlon, use_rtree, index_edges, crs_lonlat, crs_xy, graph, linked_edges, dir, deserializing)
     79 self.use_rtree = use_rtree
     80 if self.use_rtree:
---> 81     self.setup_index(deserializing=deserializing)
     83 self.crs_lonlat = 'EPSG:4326' if crs_lonlat is None else crs_lonlat  # GPS
     84 self.crs_xy = 'EPSG:3395' if crs_xy is None else crs_xy  # Mercator projection

File D:\miniconda3\envs\Map_Matching\lib\site-packages\leuvenmapmatching\map\inmem.py:384, in InMemMap.setup_index(self, force, deserializing)
    382 else:
    383     logger.debug(f"Creating new in-memory rtree index (args={args}) ...")
--> 384 self.rtree = rtree.index.Index(*args)
    385 t_delta = time.time() - t_start
    386 logger.debug(f"... done: rtree size = {self.rtree_size()}, time = {t_delta} sec")

File D:\miniconda3\envs\Map_Matching\lib\site-packages\rtree\index.py:273, in Index.__init__(self, *args, **kwargs)
    271 if stream and self.properties.type == RT_RTree:
    272     self._exception = None
--> 273     self.handle = self._create_idx_from_stream(stream)
    274     if self._exception:
    275         raise self._exception

File D:\miniconda3\envs\Map_Matching\lib\site-packages\rtree\index.py:1263, in Index._create_idx_from_stream(self, stream)
   1260     return 0
   1262 stream = core.NEXTFUNC(py_next_item)
-> 1263 return IndexStreamHandle(self.properties.handle, stream)

File D:\miniconda3\envs\Map_Matching\lib\site-packages\rtree\index.py:1396, in Handle.__init__(self, *args, **kwargs)
   1395 def __init__(self, *args: Any, **kwargs: Any) -> None:
-> 1396     self._ptr = self._create(*args, **kwargs)

File D:\miniconda3\envs\Map_Matching\lib\site-packages\rtree\core.py:25, in check_void(result, func, cargs)
     23     msg = f'Error in "{func.__name__}": {s}'
     24     rt.Error_Reset()
---> 25     raise RTreeError(msg)
     26 return result

RTreeError: Error in "Index_CreateWithStream": Spatial Index Error: IllegalArgumentException: SpatialIndex::DiskStorageManager: Index/Data file cannot be created.

Whenever I create a map in-memory object, it requires a significant amount of computation, including importing points, importing edges, and performing deduplication operations. This results in a long processing time each time I generate an in-memory map object. To overcome this, I attempted to export my pre-computed map in-memory object and load it directly from a file for future use.

Initially, I tried using Python's built-in pickle module. However, when I loaded the exported .pkl file back, it could not be used for path matching: the matched paths were empty. I suspected that certain attributes, such as the R-tree index, might have been lost when exporting the in-memory map object.
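
If the suspicion is that only the spatial index is lost in a plain pickle round trip, one quick check, as a hedged sketch (the file name is hypothetical, and setup_index with its force flag is assumed from the signature shown in the traceback above), is to rebuild the R-tree after loading:

import pickle

# Hypothetical file written earlier with pickle.dump(map_con, f)
with open('myosm_raw.pkl', 'rb') as f:
    map_con = pickle.load(f)

# Assumption: setup_index(force=True) rebuilds the rtree index on an otherwise intact map.
map_con.setup_index(force=True)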

To address this, I consulted the official LeuvenMapMatching documentation and discovered the dump() and deserialize() methods provided by the package. I attempted to use these recommended backup and loading methods, but during the process I encountered the error above.

I would greatly appreciate assistance in resolving this issue.

wannesm commented 5 months ago

First, the InMemMap class is meant for testing and simple experiments; it is not recommended for storing long computations. There is an SQLite-backed wrapper, leuvenmapmatching.map.sqlite.SqliteMap, for that.
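
A minimal sketch of that route (assuming, not verified in this thread, that SqliteMap accepts a name, use_latlon and dir like InMemMap and shares its add_node/add_edge interface):

from pathlib import Path
from leuvenmapmatching.map.sqlite import SqliteMap

# Assumption: constructor arguments mirror InMemMap; dir is where the sqlite file is stored.
map_con = SqliteMap("myosm", use_latlon=True, dir=Path("path/to/writable/dir"))
map_con.add_node(1, (51.00, 4.70))   # (lat, lon), dummy values
map_con.add_node(2, (51.01, 4.71))
map_con.add_edge(1, 2)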

But if you insist on using InMemMap, be aware that dump and deserialize are not compatible with each other. The matching pairs are dump and from_pickle on the one hand, and serialize and deserialize on the other. For example:

from pathlib import Path

map_con.dir = Path("path/where/you/want/to/store/your/pickled/file/")
filename = map_con.dir / (map_con.name + ".pkl")
map_con.dump()
map_con2 = InMemMap.from_pickle(filename=filename)
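
For completeness, a minimal sketch of the other pairing mentioned above: keep the dictionary returned by serialize and hand it straight back to deserialize, rather than mixing it with dump/from_pickle.

data = map_con.serialize()             # plain dictionary representation of the map
map_con3 = InMemMap.deserialize(data)  # rebuild the map from that dictionary
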
SheriffRabbit commented 5 months ago

First of all, thank you very much for answering my question. I am very touched that I could get such a professional answer so quickly; salute to you!

Maybe I was not very clear before. The "calculation" I mentioned comes from needing to import "road_net.geojson" data into the in-memory map. Since I did not find GeoJSON support in the library, I used the geopandas library to read the .geojson file first, then processed the GPS information in the "geometry" column and imported it into the InMemMap class following the steps below. That whole process is the calculation I mentioned before (sorry that I may not have described it very clearly); because there is no example of a GeoJSON import on the Internet, I made this attempt. Every run of this "calculation" takes a long time, mainly for adding points and edges to the in-memory map, plus removing duplicate points and other operations. So I wondered whether I could export this in-memory map object and reuse it later without having to recalculate the map.
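
For reference, a hedged sketch (file name and structure are illustrative, not taken from this thread) of how the graph structure consumed by the code below could be built from such a GeoJSON road network with geopandas:

import geopandas as gpd

# Assumption: the GeoJSON contains LineString roads; each becomes a list of (lon, lat) points.
gdf = gpd.read_file("road_net.geojson")
graph = [list(geom.coords) for geom in gdf.geometry]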

from IPython.display import display, clear_output
import time
from leuvenmapmatching.matcher.distance import DistanceMatcher  
from leuvenmapmatching.map.inmem import InMemMap  

def update_progress_bar(progress):
    bar_length = 50
    block = int(round(bar_length * progress))
    progress_str = "\r[{}] {:.2%}".format("#" * block + "-" * (bar_length - block), progress)
    clear_output(wait=True)
    display(progress_str)

def build_map_in_memory(graph):
    map_con = InMemMap("myosm", use_latlon=True, use_rtree=True, index_edges=True)

    latlon_node_dict = {}
    nodeid = 0
    total_nodes = sum(len(load) for load in graph)

    for load in graph:
        for node in load:
            nodeid += 1
            lat, lon = node[1], node[0]
            map_con.add_node(nodeid, (lat, lon))
            latlon_node_dict[str((lat, lon))] = nodeid

            progress = nodeid / total_nodes
            update_progress_bar(progress)

    for i, load in enumerate(graph):
        node_a, node_b = load[0][0:2], load[-1][0:2]
        node_a = node_a[::-1]
        node_b = node_b[::-1]

        map_con.add_edge(latlon_node_dict[str(tuple(node_a))], latlon_node_dict[str(tuple(node_b))])
        map_con.add_edge(latlon_node_dict[str(tuple(node_b))], latlon_node_dict[str(tuple(node_a))])

        progress = (i + 1) / len(graph)
        update_progress_bar(progress)

    map_con.purge()

    clear_output(wait=True)

    return map_con

def match_path_to_graph(path, map_con):
    # matcher = DistanceMatcher(map_con, max_dist=30, obs_noise=20, min_prob_norm=0.5, non_emitting_states=True, only_edges=True)
    matcher = DistanceMatcher(map_con, max_dist=100, obs_noise=100, min_prob_norm=0.5, non_emitting_states=True, only_edges=True)
    states, _ = matcher.match(path)
    nodes = matcher.path_pred_onlynodes
    return states, nodes
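
A short usage sketch tying the two helpers together (the coordinates below are made up, purely to show the expected shapes; they will not produce a meaningful match):

# Each road segment ("load") is a sequence of (lon, lat) points, as build_map_in_memory expects.
graph = [
    [(4.700, 51.000), (4.710, 51.010)],
    [(4.710, 51.010), (4.720, 51.020)],
]
# GPS observations are (lat, lon) pairs, matching the map built with use_latlon=True.
path = [(51.005, 4.705), (51.015, 4.715)]

map_con = build_map_in_memory(graph)
states, nodes = match_path_to_graph(path, map_con)
print(states, nodes)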

Based on this situation, I tried the following code according to your suggestion to read the .pkl file written by dump() in the leuven library, and got the following error; the error information is almost the same as above. I used administrator permissions when running Jupyter and the result is the same. My free disk space is more than 20 GB and the .pkl file is only 845 KB, so my analysis is that the disk is not the cause of the error. (Although I could get the result without a backup by waiting about 20 minutes for the "calculation", my ultimate goal is to match each set of GPS points, and here I am just trying to figure out the real reason for this failure.) In addition to dump() in the leuven library, I have also tried dump and load in Python's own pickle module. However, the loaded data could not be used to predict the path and the result was an empty list.

The following is the error after modifying the code according to your suggestion:

map_conn = InMemMap.from_pickle(filename="E:\jupyternotebook\共享单车路径匹配\map_con\myosm.pkl")

---------------------------------------------------------------------------
RTreeError                                Traceback (most recent call last)
Cell In[3], line 1
----> 1 map_conn = InMemMap.from_pickle(filename="E:\jupyternotebook\共享单车路径匹配\map_con\myosm.pkl")

File D:\miniconda3\envs\Map_Matching\lib\site-packages\leuvenmapmatching\map\inmem.py:176, in InMemMap.from_pickle(cls, filename)
    174 with filename.open("rb") as ifile:
    175     data = pickle.load(ifile)
--> 176 nmap = cls.deserialize(data)
    177 return nmap

File D:\miniconda3\envs\Map_Matching\lib\site-packages\leuvenmapmatching\map\inmem.py:144, in InMemMap.deserialize(cls, data)
    141 @classmethod
    142 def deserialize(cls, data):
    143     """Create a new instance from a dictionary."""
--> 144     nmap = cls(data["name"], dir=data.get("dir", None),
    145                use_latlon=data["use_latlon"], use_rtree=data["use_rtree"],
    146                index_edges=data["index_edges"],
    147                crs_lonlat=data.get("crs_lonlat", None), crs_xy=data.get("crs_xy", None),
    148                graph=data.get("graph", None), linked_edges=data.get("linked_edges", None),
    149                deserializing=True)
    150     return nmap

File D:\miniconda3\envs\Map_Matching\lib\site-packages\leuvenmapmatching\map\inmem.py:81, in InMemMap.__init__(self, name, use_latlon, use_rtree, index_edges, crs_lonlat, crs_xy, graph, linked_edges, dir, deserializing)
     79 self.use_rtree = use_rtree
     80 if self.use_rtree:
---> 81     self.setup_index(deserializing=deserializing)
     83 self.crs_lonlat = 'EPSG:4326' if crs_lonlat is None else crs_lonlat  # GPS
     84 self.crs_xy = 'EPSG:3395' if crs_xy is None else crs_xy  # Mercator projection

File D:\miniconda3\envs\Map_Matching\lib\site-packages\leuvenmapmatching\map\inmem.py:384, in InMemMap.setup_index(self, force, deserializing)
    382 else:
    383     logger.debug(f"Creating new in-memory rtree index (args={args}) ...")
--> 384 self.rtree = rtree.index.Index(*args)
    385 t_delta = time.time() - t_start
    386 logger.debug(f"... done: rtree size = {self.rtree_size()}, time = {t_delta} sec")

File D:\miniconda3\envs\Map_Matching\lib\site-packages\rtree\index.py:273, in Index.__init__(self, *args, **kwargs)
    271 if stream and self.properties.type == RT_RTree:
    272     self._exception = None
--> 273     self.handle = self._create_idx_from_stream(stream)
    274     if self._exception:
    275         raise self._exception

File D:\miniconda3\envs\Map_Matching\lib\site-packages\rtree\index.py:1263, in Index._create_idx_from_stream(self, stream)
   1260     return 0
   1262 stream = core.NEXTFUNC(py_next_item)
-> 1263 return IndexStreamHandle(self.properties.handle, stream)

File D:\miniconda3\envs\Map_Matching\lib\site-packages\rtree\index.py:1396, in Handle.__init__(self, *args, **kwargs)
   1395 def __init__(self, *args: Any, **kwargs: Any) -> None:
-> 1396     self._ptr = self._create(*args, **kwargs)

File D:\miniconda3\envs\Map_Matching\lib\site-packages\rtree\core.py:25, in check_void(result, func, cargs)
     23     msg = f'Error in "{func.__name__}": {s}'
     24     rt.Error_Reset()
---> 25     raise RTreeError(msg)
     26 return result

RTreeError: Error in "Index_CreateWithStream": Spatial Index Error: IllegalArgumentException: SpatialIndex::DiskStorageManager: Index/Data file cannot be created.
wannesm commented 5 months ago

That appears to be an error from the rtree package. Maybe the working directory is not accessible to you for writing intermediate files?
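
One way to test that hypothesis, as a hedged sketch: it relies only on what the traceback above shows, namely that deserialize() passes data["dir"] to the constructor, which is where the rtree index files get created. Pointing that entry at a directory that is certainly writable before deserializing may avoid the DiskStorageManager error.

import pickle
from pathlib import Path
from leuvenmapmatching.map.inmem import InMemMap

pkl_path = Path(r"E:\jupyternotebook\共享单车路径匹配\map_con\myosm.pkl")
with pkl_path.open("rb") as f:
    data = pickle.load(f)

# Redirect the map's working directory to a writable location (assumption: a Path is accepted here).
data["dir"] = pkl_path.parent
map_conn = InMemMap.deserialize(data)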