whosonfirst / go-whosonfirst-dist

Go package for working with Who's On First distributions
BSD 3-Clause "New" or "Revised" License

Inconsistent counts when creating combined distributions #16

Open stepps00 opened 4 years ago

stepps00 commented 4 years ago

When creating "combined" distribution files using the same command as found in the initial comment here, I noticed that the feature counts in the distribution vary between builds.

I checked out the master branch and pulled from each admin repo to get all recent changes, then built a combined distribution yesterday. This resulted in a STATUS message of:

[wof-dist-build] STATUS time to index all (5318600) : 10h47m2.070178999s

The distribution was created and I see expected outputs.

Then I ran a second build and saw this STATUS message:

[wof-dist-build] STATUS time to index all (5313171) : 10h48m0.906850201s

The distribution was also created and I see expected outputs.

I need to dig into each distribution file to see what differs between the two (the second build indexed 5,429 fewer features than the first), but it is surprising to see the same command, run against the same code and admin data, return two different counts.

stepps00 commented 4 years ago

Related (?): https://github.com/whosonfirst/go-whosonfirst-crawl/releases/tag/v0.2.0

stepps00 commented 4 years ago

I'll reply here this week with example wof:ids that are causing issues during dist creation...