datacommonsorg / website

Code for the Data Commons website
https://datacommons.org
Apache License 2.0

Major re-structure of build-embeddings tool; Use multiple sv/topic with the same description now #4371

shifucun closed 3 months ago

shifucun commented 3 months ago

Major changes:

Minor changes:

TODO:

shifucun commented 3 months ago

Epic PR indeed!

Could this be broken into multiple smaller PRs by any chance?

This is one of the smaller PRs after the break-up :) At this point, the effective code change is the two build_embeddings scripts, where a point fix is very hard because all the fundamental functions changed. The preindex file diff serves as a guard here.
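To illustrate how a diff of a generated file can act as a guard during a large refactor, here is a minimal sketch. It is not the tool's actual code: the column names (`sentence`, `dcid`), the sample rows, the topic identifier, and the helpers `load_rows` and `diff_preindex` are all hypothetical. The idea is to compare the generated file before and after the change and review exactly which rows were added or removed.

```python
import csv
import io


def load_rows(text):
    """Parse CSV text into a set of row tuples, skipping the header."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row
    return {tuple(row) for row in reader if row}


def diff_preindex(before_text, after_text):
    """Return (removed, added) rows between two versions of a generated file."""
    before = load_rows(before_text)
    after = load_rows(after_text)
    return sorted(before - after), sorted(after - before)


# Hypothetical file contents; column names and the topic id are assumptions.
before = "sentence,dcid\npopulation count,Count_Person\n"
after = (
    "sentence,dcid\n"
    "population count,Count_Person\n"
    "population count,dc/topic/Population\n"
)

removed, added = diff_preindex(before, after)
print("removed:", removed)
print("added:", added)
```

In this sketch the only difference is one added row that shares a description with an existing entry, which is the kind of expected change a reviewer could confirm from the diff while treating anything else as a regression.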