fvankrieken closed 2 months ago
Hex runtime comparison:
3 minutes with tiling and no distance calculated for this specific variable
Got it down to 2ish minutes locally by fiddling with hexagon size
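For a rough sense of why hexagon size can move the runtime, here is a minimal sketch of how tile count scales with hexagon side length. It is not the project's tiling code: the function names are hypothetical, the region is reduced to a bare area, and edge effects are ignored.

```python
import math


def hexagon_area(side: float) -> float:
    """Area of a regular hexagon with the given side length."""
    return 3 * math.sqrt(3) / 2 * side ** 2


def approx_hex_count(region_area: float, side: float) -> int:
    """Rough number of hexagons needed to cover a region (ignores edge effects)."""
    return math.ceil(region_area / hexagon_area(side))


# Doubling the side length cuts the tile count to roughly a quarter.
small_tiles = approx_hex_count(100.0, 1.0)
large_tiles = approx_hex_count(100.0, 2.0)
```

Fewer, larger hexes mean fewer tiles to process but more geometry per tile, so the best size is something to find empirically, as was done here.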
Getting pilot projects expectations sorted out, but code is ready for review.
Not quite sure if it makes sense, but I'm leaving out export for now, just because I've been tinkering for a while up to this point and would like to get this merged. Export should be straightforward, and maybe it would make sense to just include it here, but I'd still prefer to leave as is
agree that adding new sources to the exported FGDB is a pain and worth doing after this
just to note, one insight I've seen come from the export part is unexpected geometry types
Since these datasets are not buffered, I currently have buffer as null in their staging/buffer tables, then coalesce at int_buffers__all. Do we feel like this is unsafe?
It doesn't seem unsafe, though maybe it isn't consistent with something like the airports, which aren't actually buffered either. But down to go with this as is and revisit how we handle non-buffered buffers (call them all spatial or something else that's inclusive of buffered and non-buffered source data)
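The null-then-coalesce pattern discussed above can be sketched with an in-memory SQLite table. This is an illustration only: the table name `stg_example`, the `raw_geom` column, and the choice to fall back to the unbuffered source geometry are all assumptions, and geometries are stood in for by WKT text rather than real spatial types.

```python
import sqlite3


def coalesce_buffers(rows):
    """Load (source, raw_geom, buffer) rows and return (source, geom) pairs,
    falling back to raw_geom wherever buffer is NULL."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical staging table; a NULL buffer marks a non-buffered source.
    conn.execute("CREATE TABLE stg_example (source TEXT, raw_geom TEXT, buffer TEXT)")
    conn.executemany("INSERT INTO stg_example VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT source, COALESCE(buffer, raw_geom) AS geom "
        "FROM stg_example ORDER BY source"
    ).fetchall()
    conn.close()
    return result


rows = [
    ("buffered_source", "POINT (0 0)", "POLYGON ((0 0, 1 0, 1 1, 0 0))"),
    ("unbuffered_source", "POLYGON ((5 5, 6 5, 6 6, 5 5))", None),
]
combined = coalesce_buffers(rows)
```

The downside raised in the thread still applies: a reader of the staging table sees a null and has to know that the combined model fills it in later.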
Build is currently failing on the expected-values tests, because of a nondeterministic row_number here
I'll work on a fix - definitely a good thing to not be having random numbers in an output table. Moving towards just unioning these things (which isn't as much of a performance cost thanks to hexes)
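One common way row numbering becomes nondeterministic is ordering on a key with ties, so the numbers depend on whatever order the rows happen to arrive in. The thread does not confirm that this was the cause here, and the fix described above is to move to unions rather than add a tie-breaker. The sketch below only illustrates the general failure mode, with hypothetical field names.

```python
def number_rows(rows, key):
    """Assign 1-based row numbers after sorting by key.
    Python's sort is stable, so tied rows keep their input order."""
    ordered = sorted(rows, key=key)
    return {row["id"]: n for n, row in enumerate(ordered, start=1)}


rows_a = [{"id": "a", "layer": "x"}, {"id": "b", "layer": "x"}]
rows_b = list(reversed(rows_a))  # same data, different arrival order

# Ordering only on the non-unique "layer": numbering follows arrival order.
tied_a = number_rows(rows_a, key=lambda r: r["layer"])
tied_b = number_rows(rows_b, key=lambda r: r["layer"])

# Adding a unique tie-breaker makes the numbering independent of arrival order.
stable_a = number_rows(rows_a, key=lambda r: (r["layer"], r["id"]))
stable_b = number_rows(rows_b, key=lambda r: (r["layer"], r["id"]))
```

Here `tied_a` and `tied_b` disagree while `stable_a` and `stable_b` match, which is the property an expected-values test needs.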
Build now passing, added some helpful comments in a final rebase. Good to go
I probably won't get to PR review until tomorrow, and I don't want to hold you back. Feel free to merge without my input
Sounds good - I added a few helpful code comments in your honor!
closes #672
Will be easiest to go commit by commit
More maintenance-type stuff
clip_to_geom macro
Meat of new logic
Since these datasets are not buffered, I currently have buffer as null in their staging/buffer tables, then coalesce at int_buffers__all. Do we feel like this is unsafe? Or unintuitive? It probably just makes more sense to do this upstream, with the individual tables; I did it this way just to keep those tables more specific to their source data. Also, this commit does not add expectations to _sources.yml; that comes in another commit