chris-aeviator closed this issue 3 years ago
@pka any info on why the process doesn't exit?
I think, it's a logical error like https://rust-lang.github.io/wg-async-foundations/vision/status_quo/aws_engineer/solving_a_deadlock.html
Could you make the GPKG available for reproducing the problem (my email is in the Github infos)? I would like to solve that before releasing 0.14.
Happy to provide it. The smallest one that reproduces the issue, though, has
Feature Count: 2452803
and is roughly 800 MB.
Are you willing to run this?
If I can reproduce the deadlock with it, sure!
Sending it to you via email with a link.
So I can see that my 786 MB file gets processed into a folder of 985 MB (zoom 12-20). After the "major work" has finished, I can see a drop in the nodes' network communication (I'm running --nodes 2, with the two nodes using --nodeno 0 and --nodeno 1 respectively).
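For context, t-rex's distributed seeding is driven by the --nodes and --nodeno flags. This thread doesn't show how the work is actually partitioned; purely as an illustration (the function name and the modulo scheme below are my assumptions, not t-rex's real code), a seeder could split tiles across nodes like this:

```rust
// Illustrative sketch only: one plausible way a distributed seeder could
// assign tiles to nodes is by a modulo over a running tile index.
// With --nodes 2, node 0 and node 1 each take every other tile, so
// together they cover all tiles exactly once.
fn tile_assigned_to_node(tile_index: u64, nodes: u64, nodeno: u64) -> bool {
    tile_index % nodes == nodeno
}

fn main() {
    let nodes = 2;
    for tile_index in 0..6u64 {
        // Exactly one node claims each tile.
        let owner = (0..nodes)
            .find(|&n| tile_assigned_to_node(tile_index, nodes, n))
            .unwrap();
        println!("tile {} -> node {}", tile_index, owner);
    }
}
```

If both processes were started with the same --nodeno, each would seed the same half of the tiles and the other half would never be generated, which is worth double-checking when tiles appear to be missing.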
I say the major work seems to have finished because, while I can still see unwrap errors in the t-rex log, the cache folder no longer grows in size. My CPU is still utilized, but I can hear the fans spin down significantly at the point the network communication stops (only around 2.5 MB/sec, so that can't be the bottleneck).
EDIT: Maybe I have been too impatient; at least today, with this dataset, t-rex seems to pick up its work again, though with around 1 core less. Waiting to see if I can still reproduce the previous issue (which I had on a 10 GB GPKG).
EDIT 2: After another 30 min, CPU usage dropped by another 3 cores, the folder stopped growing at 1.7 GB, no network comms, no exit from generate. Being patient this time :)
I can confirm now that layers which do not raise the error described in https://github.com/t-rex-tileserver/t-rex/issues/243 do cleanly exit the generate step.
Fixed in 01108a6. Connection timeouts (#243) are now properly handled.
When running
t_rex generate --maxzoom 20 --progress
on a config that loads a .gpkg
file with around 240,000 hexagons (MULTIPOLYGON, roughly 10 meters wide), I get a cache folder with all zoom levels. I made sure to set the maxquery value to something ridiculously high, like 100 million, until I no longer see warnings. However, I have to cancel the process manually; even after waiting 30 min after the last tile has been created,
the process does not seem to finish. During cache creation I can see 12 cores being utilized at 100% (yay, nice job); after the last tile finishes, CPU usage drops but still stays high on almost all cores. The progress indicator no longer shows updates, and I'm unsure whether something important is still running.
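For reference, a minimal config of the kind described above might look roughly like this. This is a sketch from my reading of the t-rex docs, not a verified config: the layer/table names, geometry column, cache path, and the exact key for the "maxquery" limit (written here as query_limit) are assumptions to check against your t-rex version.

```toml
[service.mvt]
viewer = true

[[datasource]]
name = "gpkg"
# GDAL/OGR datasource pointing at the GeoPackage file
path = "a.gpkg"

[grid]
predefined = "web_mercator"

[[tileset]]
name = "hexagons"              # placeholder tileset name

[[tileset.layer]]
name = "hexagons"              # placeholder layer name
geometry_field = "geom"        # placeholder geometry column
geometry_type = "MULTIPOLYGON"
# the "maxquery" limit mentioned above, set very high to silence
# query-limit warnings (exact key name may differ by version)
query_limit = 100000000

[cache.file]
base = "/tmp/tilecache"        # placeholder cache directory
```

With a config like this, the reported run would be `t_rex generate --config <file> --maxzoom 20 --progress`.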
It might be another issue, but I keep experiencing "holes" in the resulting tiles on certain zoom levels. The maxzoom level always shows the complete dataset. I can see this issue both in the included viewer and in my deck.gl-based (https://deck.gl/) product.
incorrect (zoom 13, zoom 12, zoom 10...)
correct (zoom 14)