kushagra19 / Map_Matching

This is an implementation of "Hidden Markov Map Matching Through Noise and Sparseness" by Paul Newson and John Krumm. It also contains some modifications intended to improve the results.

Hi! I have a new question about your data, which is loaded by pickle.load #2

Closed r8pto closed 8 months ago

r8pto commented 11 months ago

Hi, dear friends! I've also been trying to reproduce the results of that paper recently, and I have a question about your data. How do you convert the raw data (.xlsx) into a format that can be loaded with pickle.load? That's a step I'm not particularly familiar with, and I haven't found any clues in your code.

kushagra19 commented 11 months ago

Hi, I assume you downloaded the data from https://www.microsoft.com/en-us/research/publication/hidden-markov-map-matching-noise-sparseness/. If so: I downloaded the text file and converted it into NumPy format. Once you have this, you can see in the pre_process.py file how the dataset is preprocessed (you can also adjust the part of the map and the sampling by changing lines 80-83) to convert it into the adjacency list format stored in saved_dicts. Also, I did not use the road network provided on that page, because the osmnx library for Python can already fetch road networks for most major cities in the world. So if you intend to use something other than osmnx, you will have to adjust the code a bit.
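A minimal sketch of the text-to-pickle step described above. It is not the code from this repository. It assumes the downloaded text file is tab-separated with one header row and has latitude and longitude in columns 2 and 3; `text_to_pickle`, `load_pickle`, `lat_col`, and `lon_col` are hypothetical names chosen for illustration, so check them against your copy of the file.

```python
import csv
import pickle

import numpy as np


def text_to_pickle(txt_path, pkl_path, lat_col=2, lon_col=3):
    """Read a tab-separated GPS text file and pickle its (lat, lon) pairs.

    Assumed layout (not confirmed by the repository): one header row,
    tab-separated fields, latitude in lat_col and longitude in lon_col.
    """
    rows = []
    with open(txt_path, newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        next(reader, None)  # skip the header row
        for row in reader:
            if not row:
                continue  # ignore blank lines
            rows.append((float(row[lat_col]), float(row[lon_col])))

    # Shape (n_points, 2), even when the file has no data rows.
    coords = np.asarray(rows, dtype=float).reshape(-1, 2)

    with open(pkl_path, "wb") as f:
        pickle.dump(coords, f)
    return coords


def load_pickle(pkl_path):
    """Load the array written by text_to_pickle."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)
```

After this, `load_pickle("gps.pkl")` (a hypothetical file name) returns a NumPy array with one row per GPS fix, which is the kind of input a preprocessing script can iterate over.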

r8pto commented 11 months ago

Yes, I downloaded the data from the website. Thank you very much! I got a lot of help here.

r8pto commented 11 months ago

Aha, I ran the "Pre-processing" code and found that it takes a lot of time. Maybe I should find another method to deal with the problem.

kushagra19 commented 11 months ago

Yeah, if you are following the paper, then finding neighbours for every point takes time, though once you have them the HMM runs very fast. You can also try improving concurrency in pre_process.py, which could make it faster.
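One way to picture the concurrency suggestion, as a rough sketch rather than the code in pre_process.py: split the GPS points into chunks and search each chunk for nearby candidates in a worker pool. For simplicity it matches against road network nodes rather than road segments. All names here (`haversine_m`, `neighbours_for_chunk`, `find_neighbours`, `radius_m`, `workers`) are hypothetical, and both arrays are assumed to hold (lat, lon) rows in degrees.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat, lon, lats, lons):
    """Great-circle distance in metres from one point to many points."""
    lat, lon, lats, lons = map(np.radians, (lat, lon, lats, lons))
    a = (np.sin((lats - lat) / 2) ** 2
         + np.cos(lat) * np.cos(lats) * np.sin((lons - lon) / 2) ** 2)
    # Clip guards against tiny floating-point overshoot above 1.
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def neighbours_for_chunk(gps_chunk, nodes, radius_m):
    """For each GPS point in the chunk, list the node indices within radius_m."""
    found = []
    for lat, lon in gps_chunk:
        dist = haversine_m(lat, lon, nodes[:, 0], nodes[:, 1])
        found.append(np.flatnonzero(dist <= radius_m).tolist())
    return found


def find_neighbours(gps, nodes, radius_m=200.0, workers=4):
    """Run the neighbour search over chunks of GPS points in a thread pool."""
    chunks = np.array_split(gps, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the chunk order, so results line up with gps rows.
        per_chunk = pool.map(
            neighbours_for_chunk,
            chunks,
            [nodes] * len(chunks),
            [radius_m] * len(chunks),
        )
        return [idx for chunk in per_chunk for idx in chunk]
```

Threads only help to the extent that the NumPy array operations run outside the interpreter lock; if the per-point work is mostly pure Python, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface and may be the better fit.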

r8pto commented 11 months ago

Hi! When I uncommented block 40 to calculate the route distance, I got an error. Have you ever run into that? The error says that the edge ID is not in the graph (osmnx).
[image: screenshot of the error]

r8pto commented 11 months ago

Or could you upload your route_distances_FULL file? I could see what it looks like and then build it in some other way.

kushagra19 commented 11 months ago

I do not have the full route distance file with me right now, but I faced such errors when osmnx updated its data (edge IDs and vertices) while the adjacency list still had the old edge IDs. When this happens, the adjacency list refers to an edge that the graph does not have. I would investigate whether that edge exists in the graph or not.
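A small sketch of that check, under stated assumptions. osmnx returns its road networks as networkx `MultiDiGraph` objects, so `has_edge` can test each stored edge. The layout of the saved adjacency list is assumed here to be a dict that maps a GPS point ID to a list of `(u, v, key)` edge tuples; that layout and the name `find_stale_edges` are hypothetical, so adapt them to what saved_dicts actually contains.

```python
import networkx as nx


def find_stale_edges(graph, adjacency):
    """Return {point_id: [edges that are no longer in graph]}.

    adjacency is assumed to map each GPS point ID to a list of
    (u, v) or (u, v, key) edge tuples saved from an earlier download.
    """
    stale = {}
    for point_id, edges in adjacency.items():
        missing = [edge for edge in edges if not graph.has_edge(*edge)]
        if missing:
            stale[point_id] = missing
    return stale


# Toy graph standing in for a freshly downloaded road network.
G = nx.MultiDiGraph()
G.add_edge(1, 2, key=0, length=120.0)
G.add_edge(2, 3, key=0, length=80.0)

# Saved candidates; edge (3, 4, 0) no longer exists in G.
saved = {0: [(1, 2, 0)], 1: [(2, 3, 0), (3, 4, 0)]}
print(find_stale_edges(G, saved))  # {1: [(3, 4, 0)]}
```

If the result is non-empty, the saved adjacency list and the graph come from different downloads, and rebuilding the adjacency list against the current graph is probably the simplest fix.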

r8pto commented 11 months ago

OK, thank you again! I will explore this; maybe downloading the road network from OSM could solve the problem.