datacommonsorg / website

Code for the Data Commons website
https://datacommons.org
Apache License 2.0

Create a new sentence->dcids preindex file for each embedding #4358

Closed shifucun closed 2 weeks ago

shifucun commented 2 weeks ago

This is the preindex that will actually be built going forward; the current dcid->description mapping will be deprecated later.
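To illustrate the relationship between the two layouts, here is a minimal sketch of inverting a dcid->description mapping into a sentence->dcids mapping. It is not the code from this PR: the function name `invert_descriptions`, the in-memory dict shapes, and the sample descriptions are all assumptions made for illustration, and the real preindex file format may differ.

```python
from collections import defaultdict


def invert_descriptions(dcid_to_descriptions):
    """Invert {dcid: [sentence, ...]} into {sentence: [dcid, ...]}.

    Hypothetical helper: the input and output shapes are assumed,
    not taken from the repository.
    """
    sentence_to_dcids = defaultdict(list)
    for dcid, descriptions in dcid_to_descriptions.items():
        for sentence in descriptions:
            sentence = sentence.strip()
            if not sentence:
                # Skip empty descriptions.
                continue
            # Keep each dcid at most once per sentence, in first-seen order.
            if dcid not in sentence_to_dcids[sentence]:
                sentence_to_dcids[sentence].append(dcid)
    return dict(sentence_to_dcids)


# Made-up sample data for illustration only.
example = {
    "Count_Person": ["population", "number of people"],
    "Count_Person_Female": ["female population", "population"],
}
preindex = invert_descriptions(example)
```

In this layout a sentence shared by several dcids appears once, so each distinct sentence would only need to be embedded a single time.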

This also prepares for removing the duplicate description files.

Simplified some commands by removing unused arguments, and renamed `main` and `medium` to `base`.

Removed a few tests. They were either very outdated and no longer fit the new changes, or very hard to maintain. I will add more tests after this round of changes.