Closed (mpadge closed this 1 year ago)
As I said in my comment, this looks like an entirely different optimization from the one I had in mind, but that's not to say it isn't worth it, especially if you observed a 90% speedup for a large feed! I do wonder whether the two optimizations could be combined, though.
That being said, I don't see any problems with this code. I do wonder what the impact would be on a large feed with few duplicated coordinates. In my use case, the full DELFI feed, there still seem to be around 475,000 unique coordinates out of about 520,000 stops. I'm happy to run it for a couple of hours and see.
It shouldn't matter on such a large feed, because most of the time is spent in the main scan over each row of transfers, and re-indexing back out to all (unique + duplicated) stops should be negligible by comparison. But that sure is a heck of a lot of stops!!
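For anyone following along, here is a minimal sketch of the general idea being discussed: collapse stops to unique coordinates, do the expensive pairwise scan on those only, then re-index the result back out to all stops. This is not the code from this PR. It is written in Python for illustration, and the function names (`dedup_coords`, `transfers_between_unique`, `expand_to_stops`), the brute-force scan, and the 200 m limit are all assumptions made up for this example.

```python
from collections import defaultdict
from math import asin, cos, radians, sin, sqrt


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def dedup_coords(coords):
    """Return (unique, index) such that coords[i] == unique[index[i]]."""
    seen, unique, index = {}, [], []
    for c in coords:
        if c not in seen:
            seen[c] = len(unique)
            unique.append(c)
        index.append(seen[c])
    return unique, index


def transfers_between_unique(unique, d_limit):
    """Brute-force scan (hypothetical stand-in for the real main scan)."""
    return [
        (i, j)
        for i in range(len(unique))
        for j in range(i + 1, len(unique))
        if haversine_m(unique[i], unique[j]) <= d_limit
    ]


def expand_to_stops(pairs, index):
    """Re-index pairs of unique coordinates back out to all stops."""
    members = defaultdict(list)
    for stop, u in enumerate(index):
        members[u].append(stop)
    out = set()
    # Stops sharing one coordinate are at distance zero from each other.
    for group in members.values():
        out.update((a, b) for a in group for b in group if a != b)
    # Every stop at coordinate i pairs with every stop at coordinate j.
    for i, j in pairs:
        for a in members[i]:
            for b in members[j]:
                out.update({(a, b), (b, a)})
    return out


# Stops 0 and 1 share a coordinate; stop 2 is about 56 m away; stop 3 is far off.
coords = [(52.5, 13.4), (52.5, 13.4), (52.5005, 13.4), (48.1, 11.5)]
unique, index = dedup_coords(coords)
transfers = expand_to_stops(transfers_between_unique(unique, 200.0), index)
```

In this sketch the expansion step is a single pass over the stop index plus the matched pairs, which is consistent with the expectation above that re-indexing stays cheap relative to the main scan when few coordinates are duplicated.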