Open hannojg opened 4 years ago
This doesn't look to be an error in your procedure. The planet used to complete in about 7 minutes. To me it sounds like this thing is stuck in an infinite loop. My first question: did you build this from source or use Docker, and what version of the code are you running? If it's not the latest code, please try that.
If it is the latest code, we'll need to figure out what data is triggering this. To do that, I suspect we'll need to track how the graph expansion algorithm in the code is visiting edges. It shouldn't be possible, but it seems to me it must somehow be getting stuck in a loop over the same edges.
Perhaps you can modify the code to remove the normal output of the program but rather the edgeids that have been visited. If this sounds like too much, no worries; we'll mark it as a bug and give you a hand.
Thanks for getting back to me! I've built this from source, and I will retry now with the latest commits. I used this script, btw: https://github.com/valhalla/valhalla/blob/master/scripts/Ubuntu_Bionic_Install.sh
The version I am running is 3.0.9.
Could you give me some guidance on how to achieve this:
modify the code to remove the normal output of the program but rather the edgeids that have been visited.
Thank you so much in advance!
@hannojg yeah, i would try with master. i'm fairly certain that script is now redundant: i recently made what i think are the last changes needed to make the code compile out of the box (with the right dependencies) on ubuntu up to 20.04.
try this patch:
diff --git a/src/valhalla_export_edges.cc b/src/valhalla_export_edges.cc
index 4ae1ae9b5..6147345d3 100644
--- a/src/valhalla_export_edges.cc
+++ b/src/valhalla_export_edges.cc
@@ -328,13 +328,13 @@ int main(int argc, char* argv[]) {
// keep this
edges.push_front(other);
}
-
// get the shape
std::list<PointLL> shape;
for (const auto& e : edges) {
- extend(reader, t, e, shape);
+ //extend(reader, t, e, shape);
+ std::cout << e.i << std::endl;
}
-
+/*
// output it as: shape,name,name,...
auto encoded = encode(shape);
std::cout << encoded << column_separator;
@@ -342,6 +342,7 @@ int main(int argc, char* argv[]) {
std::cout << name << (&name == &names.back() ? "" : column_separator);
}
std::cout << row_separator;
+*/
std::cout.flush();
}
If the code is stuck in a loop of edges, you should see a repeating pattern in the output.
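To spot such a pattern without eyeballing millions of lines, something like the following could report the first id that repeats. This is only a minimal sketch and not part of Valhalla: `first_repeat` is a hypothetical helper, it assumes C++17, and it assumes the printed ids have already been parsed into 64-bit integers, one per output line.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Valhalla): finds the earliest position at
// which any id repeats and returns {position of first occurrence, position of
// the repeat}. Returns std::nullopt when every id is unique.
std::optional<std::pair<std::size_t, std::size_t>>
first_repeat(const std::vector<std::uint64_t>& ids) {
  std::unordered_map<std::uint64_t, std::size_t> seen; // id -> first position
  for (std::size_t i = 0; i < ids.size(); ++i) {
    auto [it, inserted] = seen.emplace(ids[i], i);
    if (!inserted) {
      return std::make_pair(it->second, i);
    }
  }
  return std::nullopt;
}
```

If the run really is cycling, the gap between the two returned positions would be a first guess at the length of the cycle.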
I changed src/valhalla_export_edges.cc to what you suggested. I then rebuilt the code with these commands:
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
sudo make install
When I then run the command
valhalla_export_edges --config valhalla.json > planet_extract.polyline
The same thing happens as before: nothing. No output.
What confuses me is that I then added a simple log statement as the first line of the main method, but this isn't printed either. Here are my diffs:
diff --git a/src/valhalla_export_edges.cc b/src/valhalla_export_edges.cc
index 4ae1ae9b5..df49b627c 100644
--- a/src/valhalla_export_edges.cc
+++ b/src/valhalla_export_edges.cc
@@ -154,6 +154,7 @@ void extend(GraphReader& reader,
// program entry point
int main(int argc, char* argv[]) {
+ LOG_INFO("Running this valhalla thing.");
bpo::options_description options("valhalla_export_edges " VALHALLA_VERSION "\n"
"\n"
" Usage: valhalla_export_edges [options]\n"
@@ -332,9 +333,10 @@ int main(int argc, char* argv[]) {
// get the shape
std::list<PointLL> shape;
for (const auto& e : edges) {
- extend(reader, t, e, shape);
+ //extend(reader, t, e, shape);
+ std::cout << e.i << std::endl;
}
-
+ /*
// output it as: shape,name,name,...
auto encoded = encode(shape);
std::cout << encoded << column_separator;
@@ -342,8 +344,9 @@ int main(int argc, char* argv[]) {
std::cout << name << (&name == &names.back() ? "" : column_separator);
}
std::cout << row_separator;
+ */
std::cout.flush();
- }
+ }
// check progress
int procent = (100.f * set) / edge_count;
@hannojg can you run it with gdb and then stop it and see where the code is? basically, revert the changes, then compile it like this:
rm -rf build
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug
make -j$(nproc) valhalla_export_edges
then run it directly from there with gdb
gdb --args valhalla_export_edges your_config.json
# gdb will open up, then you need to start the program
run
# let it run for a while, at least a few minutes, then press ctrl-c to interrupt it
where
# this will show the current stack; copy and paste that output into this issue
One mistake in my procedure: I was redirecting the output to a file in which I wanted to save the polyline information, which is why I didn't see any output:
valhalla_export_edges --config valhalla.json > planet_extract.polyline
When I remove the > redirection I see the info log I added, but nothing happens after that.
After reverting and using gdb as you described, I get the following output:
Starting program: /home/pelias/valhalla/build/valhalla_export_edges ../../valhalla.json
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
^C
Program received signal SIGINT, Interrupt.
0x00005555555c2491 in valhalla::midgard::tar::header_t::verify (this=0x7fe71faa9000) at /home/pelias/valhalla/valhalla/midgard/sequence.h:572
572 sum += ((char*)&temp)[i];
(gdb) where
#0 0x00005555555c2491 in valhalla::midgard::tar::header_t::verify (this=0x7fe71faa9000) at /home/pelias/valhalla/valhalla/midgard/sequence.h:572
#1 0x00005555555c2c33 in valhalla::midgard::tar::tar (this=0x5555559db230, tar_file="/home/pelias/valhalla_tiles.tar", regular_files_only=true) at /home/pelias/valhalla/valhalla/midgard/sequence.h:608
#2 0x00005555555b8139 in valhalla::baldr::GraphReader::tile_extract_t::tile_extract_t (this=0x5555559db190, pt=...) at /home/pelias/valhalla/src/baldr/graphreader.cc:31
#3 0x00005555555b9cc6 in valhalla::baldr::GraphReader::get_extract_instance (pt=...) at /home/pelias/valhalla/src/baldr/graphreader.cc:100
#4 0x00005555555bb036 in valhalla::baldr::GraphReader::GraphReader (this=0x7fffffffe210, pt=..., tile_getter=...) at /home/pelias/valhalla/src/baldr/graphreader.cc:349
#5 0x0000555555564737 in main (argc=2, argv=0x7fffffffe408) at /home/pelias/valhalla/src/valhalla_export_edges.cc:211
(gdb)
seems your tar is either corrupt or you are using a super slow hard drive. the first thing the code does when you use it with a tar is scan the whole thing to see what tiles are in it. that's what this stack trace is showing.
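for anyone following along, here is roughly what such a scan involves. this is a minimal sketch based on the standard ustar header layout (name at offset 0, octal size at offset 124, octal checksum at offset 148), not Valhalla's actual implementation; `parse_octal`, `header_ok` and `list_entries` are hypothetical names, and treating a leading zero byte as the end of the archive is a simplification.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::size_t kBlock = 512;   // tar block size
constexpr std::size_t kNameLen = 100; // name field: offset 0, 100 bytes
constexpr std::size_t kSizeOff = 124; // size field: 12 bytes of octal text
constexpr std::size_t kSizeLen = 12;
constexpr std::size_t kChkOff = 148;  // checksum field: 8 bytes of octal text
constexpr std::size_t kChkLen = 8;

// Parse an octal text field, stopping at the first non-octal character.
std::uint64_t parse_octal(const unsigned char* p, std::size_t n) {
  std::uint64_t v = 0;
  std::size_t i = 0;
  while (i < n && p[i] == ' ') ++i; // skip leading spaces
  for (; i < n && p[i] >= '0' && p[i] <= '7'; ++i) v = v * 8 + (p[i] - '0');
  return v;
}

// A header is valid when the stored checksum equals the sum of all 512 bytes,
// with the checksum field itself counted as eight spaces.
bool header_ok(const unsigned char* h) {
  std::uint64_t sum = 0;
  for (std::size_t i = 0; i < kBlock; ++i)
    sum += (i >= kChkOff && i < kChkOff + kChkLen) ? ' ' : h[i];
  return sum == parse_octal(h + kChkOff, kChkLen);
}

// List entry names by hopping from header to header over an in-memory archive.
std::vector<std::string> list_entries(const std::vector<unsigned char>& tar) {
  std::vector<std::string> names;
  std::size_t pos = 0;
  while (pos + kBlock <= tar.size()) {
    const unsigned char* h = tar.data() + pos;
    if (h[0] == 0 || !header_ok(h)) break; // end marker (simplified) or bad header
    std::size_t len = 0;
    while (len < kNameLen && h[len] != 0) ++len;
    names.emplace_back(reinterpret_cast<const char*>(h), len);
    const std::uint64_t size = parse_octal(h + kSizeOff, kSizeLen);
    // skip the header plus the payload rounded up to whole blocks
    pos += kBlock + ((size + kBlock - 1) / kBlock) * kBlock;
  }
  return names;
}
```

the point of the sketch is that a healthy scan does one checksum pass per entry and then jumps over the payload, so the work should scale with the number of tiles rather than with the total size of the archive.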
the next thing to test is whether this works with a smaller tar file. instead of using the planet, can you build with a small country like liechtenstein (check geofabrik for a download) and then run the extraction procedure? if that finishes, then we can say it's something about the planet tar or the drive you are reading it from. if that doesn't finish, then there is something wrong with your data creation.
Okay, so the Liechtenstein polyline extract finished within seconds! I think we can rule out the hard drive, as this is running on a dedicated server from a well-known provider with an NVMe SSD.
Would it help if I provide the tar, or the output log during tile generation?
// EDIT:
I just ran tar -tvvf valhalla_tiles.tar and it lists the whole contents of the tar with no errors.
@hannojg i think we need to keep debugging this to really see what is going on. the stack above shows that this is just the initial scan of the tar file. one possibility is that a misconfiguration with respect to memory mapping on your machine leads to a crazy long time just to loop over the tar. are you comfortable with gdb? if so, could you check whether it's getting through any of the files in the tar or whether it's stuck on just one? you'll see there is a loop over the contents of the tar; i would set a breakpoint in there and see if it ever gets around to a second iteration of the loop. or you can just add a log statement there like LOG_WARN("checking header");
specifically im talking about this loop: https://github.com/valhalla/valhalla/blob/master/valhalla/midgard/sequence.h#L603-L624
you'll see there is an if on line 608. maybe it's getting stuck in there and just looking at the file one block at a time? if so, that would be something like 50000000000/512=97656250 iterations, which could take a while...
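the arithmetic behind that estimate can be checked at compile time. a minimal sketch, assuming 512-byte tar blocks and the rough 50000000000-byte archive size used above; `block_count` is a hypothetical helper, not something in Valhalla.

```cpp
#include <cstdint>

// Number of fixed-size blocks needed to cover `bytes`, rounded up.
constexpr std::uint64_t block_count(std::uint64_t bytes, std::uint64_t block = 512) {
  return (bytes + block - 1) / block;
}

// 50000000000 bytes is the rough archive size assumed in the estimate above.
static_assert(block_count(50000000000ULL) == 97656250ULL,
              "one iteration per 512-byte block");
```

so a block-by-block walk would mean close to a hundred million header checks, each summing 512 bytes, instead of one check per tile.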
also, if you are sick of debugging this and just want to move on, you can extract the tar to a directory, configure your tile_dir to point at that, and then run valhalla_export_edges that way. the tar configuration is faster, but using a directory of tiles will work just fine.
Hey, I am trying to generate polylines for the planet. The specs of the server:
I ran the following commands:
When executing the last command I get no output. The command has been running for 8 hours now with no output. I do see, however, that it pegs the CPU (one core at 99-100%).
I tried the procedure with portland-metro which worked.
Can you tell me whether there is an issue with valhalla or my procedure? Thank you so much in advance