cyang-kth / fmm

Fast map matching, an open source framework in C++
https://fmm-wiki.github.io/
Apache License 2.0
898 stars 215 forks source link

Missing sections and reverse movements in STMATCH #70

Closed cweber9843 closed 4 years ago

cweber9843 commented 4 years ago

Hi, I am using the python API (compiled to python3) and STMATCH. Things are looking promising, however, I do not get perfect matches of the data: I either get missing links or unnecessary reverse movements (and/or both).

image

I created a standalone example for the behavior here: https://github.com/cweber9843/test_fmm/blob/master/test_fmm_on_synthetic_data.ipynb Including a parameter variation and a check of the network.

My questions:

Thanks!

cyang-kth commented 4 years ago

@cweber9843

Hi, I am using the python API (compiled to python3) and STMATCH. Things are looking promising, however, I do not get perfect matches of the data: I either get missing links or unnecessary reverse movements (and/or both).

image

I created a standalone example for the behavior here: https://github.com/cweber9843/test_fmm/blob/master/test_fmm_on_synthetic_data.ipynb Including a parameter variation and a check of the network.

My questions:

  • What is the best way to tune the parameters?

The best way would be to manually label a subset of trajectories and then try various parameters to improve accuracy as much as possible.

  • Maybe build a minimizer around them? Some kind of gradient search?

The hyper-parameter training is better to be implemented externally to the current program.

  • How can the reverse movements be avoided?

Either you do a preprocessing to smooth the trajectory to remove the reverse movement (like the example you show), or (actually modify the program) it is majorly caused by the assumption that vehicle must move along the direction of an edge. (One way is to add a tolerance for reverse movement)

  • How can a connected match be guaranteed?

A match fails if the path cannot be connected (In the example, I think is likely that some network edges may be directed rather than bidirectional.).

  • Is there any paper available for the STMATCH algorithm?

The paper is here https://dl.acm.org/doi/abs/10.1145/1653771.1653820 Lou Y, Zhang C, Zheng Y, et al. Map-matching for low-sampling-rate GPS trajectories[C]//Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems. 2009: 352-361.

FMM is quite similar to this one except that precomputation is used.

  • In the FMM-paper, a penalty is mentioned for avoiding reverse movement. Is this still implemented?

No, that has been removed from the current implementation as it is seldom used.

  • Will FMM with a precalculated UBODT file give better results?

I do not think so FMM and STMATCH follow the same principle but one use precomputation and one does not.

  • How is the visualization of the network (e.g. here: #30) done? Is this a QGis plugin?

It is directly visualized in QGIS without external plugin. You just define a directed line symbol.

  • any other hints?

I would recommend you to have a look at the network in QGIS first. Some part of it may not be connected.

Thanks!

cweber9843 commented 4 years ago

Thanks for the quick reply! I just realized that my problem really might originate from the network: In order to get directed links, I duplicated the original coordinates and exchanged source/target, but without actually reversing the line. I guess for a true directed network, I will have to revert the geometry. Thanks for nudging me in the right direction!

cweber9843 commented 4 years ago

Thank you again for your comments, the problem was in the network indeed. I now reversed each geometry, and now the network truly is bidirectional, not only having source-target pairs swapped with the same direction in the geometry.

Now STMATCH reliably finds good matches, there are little reverse movements and in my test sample I only see one false positive match. The algorithm is way less sensitive to the parameters as well. I can now go on to more realistic tracks and do more testing there.

Thanks a lot again!

cweber9843 commented 4 years ago

Sorry, one more question, and I reopen this issue since I think it is somewhat related to my network, but I don't see the mistake: When I run your notebook example for STMATCH, I get the same results as the example, especially I get a result for cpath. For my own data and network, however, I only get the point data, not the edges: image

For the green line in the plot, I lookup the edge IDs in my network, but I think this is generating some artifacts, such as the lower end of the green line in the figure above.

Do you have any hints why I don't get a cpath? I did not (yet) create my shapefile the way you described, via postGIS, but rather through geopandas and networkX.

cyang-kth commented 4 years ago

@cweber9843

The reason that you have an opath but an empty cpath is there are two matched point that is not connected in the network.

If you can actually extract that that single trajectory to a separate file traj.csv and then run with log level of 0, then you can see a more detailed log of the map matching process, which may print that some points are not connected.

stmatch --network ../data/edges.shp --gps ../data/traj.csv -k 4 -r 0.4 -e 0.5 --output mr.txt -l 0

The log file can contain hundreds of thousands of rows, you can first export it to a log file then check that log

stmatch --network ../data/edges.shp --gps ../data/trips.shp -k 4 -r 0.4 -e 0.5 --output mr.txt -l 0 > stmatch.log
cweber9843 commented 4 years ago

@cyang-kth , thanks again for the quick reply.

I now ran stmatch from command line, on a WGS84 network and data (UTM32 gave an SQLite error no such column: publication_date - do you want a bug report on this? The python API runs independently of the CRS.).

However, I can't see any more information on which points are not connected: Contents of stmatch.log: [info][stmatch_app_config.cpp:48 ] Start reading stmatch configuration from arguments [info][stmatch_app_config.cpp:105] Finish with reading stmatch arg configuration [info][stmatch_app_config.cpp:109] ---- Print configuration ---- [info][network_config.cpp:6 ] NetworkConfig [info][network_config.cpp:7 ] File name: ./data/bike_centerline_v1_extract_ANVedges_source_target_rev_renamed_mini_WGS84.shp [info][network_config.cpp:8 ] ID name: id [info][network_config.cpp:9 ] Source name: source [info][network_config.cpp:10 ] Target name: target [info][gps_config.cpp:19 ] GPS format: CSV point [info][gps_config.cpp:20 ] File name: ./data/syntetic_data_ANV_WGS84.csv [info][gps_config.cpp:21 ] ID name: id [info][gps_config.cpp:22 ] x name: x [info][gps_config.cpp:23 ] y name: y [info][gps_config.cpp:24 ] Timestamp name: timestamp [info][result_config.cpp:30 ] ResultConfig [info][result_config.cpp:31 ] File: mr.txt [info][result_config.cpp:32 ] Fields: opath cpath mgeom [info][stmatch_algorithm.cpp:22 ] STMATCHAlgorithmConfig [info][stmatch_algorithm.cpp:23 ] k 4 radius 0.4 gps_error 0.5 vmax 80 factor 1.5 [info][stmatch_app_config.cpp:114] Log level 0-trace [info][stmatch_app_config.cpp:115] Step 1 [info][stmatch_app_config.cpp:116] Use omp false [info][stmatch_app_config.cpp:117] ---- Print configuration done ---- [warning][result_config.cpp:167] Overwrite existing result file mr.txt [info][network.cpp:37 ] Read network from file ./data/bike_centerline_v1_extract_ANVedges_source_target_rev_renamed_mini_WGS84.shp [info][network.cpp:130] Number of edges 60 nodes 28 [info][network.cpp:131] Field index: id 2 source 4 target 5 [info][network.cpp:134] Read network done. [info][network_graph.cpp:17 ] Construct graph from network edges start [info][network_graph.cpp:30 ] Graph nodes 28 edges 60 [info][network_graph.cpp:31 ] Construct graph from network edges end [info][gps_reader.cpp:239] Id index 0 x index 1 y index 2 time index 3 [info][stmatch_app.cpp:33 ] Progress report step 1 [info][stmatch_app.cpp:35 ] Start to match trajectories [info][stmatch_app.cpp:64 ] Run map matching in single thread [info][stmatch_app.cpp:67 ] Progress 0 [info][stmatch_app.cpp:81 ] MM process finished [info][stmatch_app.cpp:87 ] Time takes 0.002 [info][stmatch_app.cpp:88 ] Time takes excluding input 0 [info][stmatch_app.cpp:89 ] Finish map match total points 26 matched 0 [info][stmatch_app.cpp:91 ] Matched percentage: 0 [info][stmatch_app.cpp:92 ] Point match speed: 0 [info][stmatch_app.cpp:93 ] Point match speed (excluding input): -nan [info][stmatch_app.cpp:95 ] Time takes 0.002 And mr.txt: id;opath;cpath;mgeom 123;12150,12149,10902,10902,10902,10902,10902,11310,11309,11309,10122,10122,10122,10122,9381,9381,9381,9009,9764,9009,9009,9009,9009,9009,9009,9009;;LINESTRING()

I am quite confident that my network is connected and bidirectional now: image The numbers in green and red are source and target node, respectively, and the numbers in black are the edge id. All edges are bidirectional, and all of them seem to be connected (e.g. node 126 is source for edge 154, but is target for 362). I checked this for all edges in this example. bike_centerline_v1_extract_ANVedges_source_target_rev_renamed_mini_WGS84.shp

Appreciate all hints on how to go on!

cyang-kth commented 4 years ago

@cweber9843

Sorry that I find the lowest log level is set globally to info in the previous version. I have reset it to trace in https://github.com/cyang-kth/fmm/commit/9bba978a2f420d34ba8af9bc0ccad351462c14fd. So if you pull the master and run the latest version, you should see the more detailed log of stmatch.

cweber9843 commented 4 years ago

@cyang-kth Thanks, with this version the extended log is created. However, I don't see anything suspicious, at least no explicit mention of something like "point not connected": stmatch.reslog

Some points I noticed, without knowing if it is important:

I tried to run stmatch on the example data that is provided with fmm, to check how the log should look like, but I run into an error there: stmatch_example-data.reslog [critical][gps_reader.cpp:232] Geometry column y not found when running stmatch --network ./data/edges.shp --gps ./data/gps.csv --gps_point --gps_x x --gps_y y --gps_id id --output mr.txt -l 0

cyang-kth commented 4 years ago

@cweber9843

On my side, the example is run successfully with the command you provided.

[info][network_graph.cpp:17 ] Construct graph from network edges start
[info][network_graph.cpp:30 ] Graph nodes 17 edges 30
[info][network_graph.cpp:31 ] Construct graph from network edges end
[warning][gps_reader.cpp:237] Time stamp timestamp not found, will be estimated 
[info][gps_reader.cpp:240] Id index 0 x index 1 y index 2 time index -1
[info][stmatch_app.cpp:33 ] Progress report step 100
cyang-kth commented 4 years ago

@cweber9843

Can you provide the trajectory and network that you use? The log shows that the first two points are not found to be connected. 12150,12149

The normal log of the example looks like this

[trace][stmatch_algorithm.cpp:132] Optimal path size 6
[debug][stmatch_algorithm.cpp:287] Build cpath from optimal candidate path
[trace][stmatch_algorithm.cpp:295] Insert index 0
[trace][stmatch_algorithm.cpp:300] Check a 8 b 11
[trace][network_graph.cpp:56 ] Shortest path starts
[trace][stmatch_algorithm.cpp:309] Edges found 
[trace][stmatch_algorithm.cpp:316] Insert index 1
[trace][stmatch_algorithm.cpp:300] Check a 11 b 18
[trace][network_graph.cpp:56 ] Shortest path starts
[trace][network_graph.cpp:163] Backtrack starts
[trace][network_graph.cpp:174] u 5 d 0 v 6 d 1 cost 1
[trace][network_graph.cpp:190] Find edge from 4 to 5 cost 1
[trace][network_graph.cpp:198]   Check Edge from 4 to 0 cost 1
[trace][network_graph.cpp:198]   Check Edge from 4 to 7 cost 1
[trace][network_graph.cpp:198]   Check Edge from 4 to 5 cost 1
[trace][network_graph.cpp:202]   Found edge idx 12 id 13
[trace][stmatch_algorithm.cpp:309] Edges found 12
[trace][stmatch_algorithm.cpp:316] Insert index 3
[trace][stmatch_algorithm.cpp:300] Check a 18 b 18
[trace][stmatch_algorithm.cpp:319] Insert index 3
[trace][stmatch_algorithm.cpp:300] Check a 18 b 20
[trace][network_graph.cpp:56 ] Shortest path starts
[trace][stmatch_algorithm.cpp:309] Edges found 
[trace][stmatch_algorithm.cpp:316] Insert index 4
[trace][stmatch_algorithm.cpp:300] Check a 20 b 24
[trace][network_graph.cpp:56 ] Shortest path starts
[trace][stmatch_algorithm.cpp:309] Edges found 
[trace][stmatch_algorithm.cpp:316] Insert index 5
[debug][stmatch_algorithm.cpp:323] Build cpath from optimal candidate path done
[trace][stmatch_algorithm.cpp:149] Opath is 8,11,18,18,20,24
[trace][stmatch_algorithm.cpp:150] Indices is 0,1,3,3,4,5
[trace][stmatch_algorithm.cpp:151] Complete path is 8,11,13,18,20,24
[trace][geom_algorithm.cpp:354] Offset1 0.200812 Offset2 1
[trace][geom_algorithm.cpp:354] Offset1 0 Offset2 0.376624
[info][stmatch_app.cpp:81 ] MM process finished
[info][stmatch_app.cpp:87 ] Time takes 0.149
[info][stmatch_app.cpp:88 ] Time takes excluding input 0.149
[info][stmatch_app.cpp:90 ] Finish map match total points 17 matched 17
[info][stmatch_app.cpp:91 ] Matched percentage: 1
[info][stmatch_app.cpp:92 ] Point match speed: 114.094
[info][stmatch_app.cpp:94 ] Point match speed (excluding input): 114.094
[info][stmatch_app.cpp:95 ] Time takes 0.149
cweber9843 commented 4 years ago

Ah, it is this line showing they are not connected: [trace][stmatch_algorithm.cpp:300] Check a 12150 b 12149? ...but this also shows up in the example log...

Here is the data, the network in the shapefile and the data in the csv: minimal_example_data.zip

cyang-kth commented 4 years ago

@cweber9843

The problem with your network is that you need to reverse the geometry and the source target fields. The ID of the two reverse edges should also be different.

If one edge is id 1 from source 1 to target 2 
the reverse edge should be id (other than 1) from source 2 to target 1. 
cweber9843 commented 4 years ago

Yes! That was it, thank you very much. I thought the naming of source and target would be arbitrary, since the geometry gives the direction. Renaming source and target in addition to reversing the geometry then would compensate itself - but obviously this was wrong.

Now I get the expected results, and even the trailing ends are gone. Great work, and thanks for the help!