datacommonsorg / website

Code for the Data Commons website
https://datacommons.org
Apache License 2.0
24 stars, 82 forks

[NL Embeddings] Modularize indexes by input folders; Merge curated input and alternative input #4352

Closed by shifucun 3 months ago

shifucun commented 3 months ago

Since all the SV description CSVs have the same columns (dcid, sentence), the alternative CSVs can now be moved to the primary folder.
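
To illustrate why a shared schema makes the merge straightforward, here is a minimal sketch of combining several description CSVs into one. Only the column names (dcid, sentence) come from this PR; the function name `merge_description_csvs`, the duplicate-dropping behavior, and the sample rows are assumptions for illustration, not the repository's actual tooling.

```python
import csv
import io

# Column layout taken from the PR description.
EXPECTED_COLUMNS = ["dcid", "sentence"]


def merge_description_csvs(*csv_texts: str) -> str:
    """Merge CSV texts sharing the (dcid, sentence) columns into one CSV text.

    Hypothetical helper: exact duplicate rows are dropped (an assumption),
    and the first-seen order is preserved.
    """
    seen = set()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(EXPECTED_COLUMNS)
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        # Refuse inputs whose header differs from the shared schema.
        if reader.fieldnames != EXPECTED_COLUMNS:
            raise ValueError(f"unexpected columns: {reader.fieldnames}")
        for row in reader:
            key = (row["dcid"], row["sentence"])
            if key in seen:
                continue  # skip rows already emitted from an earlier input
            seen.add(key)
            writer.writerow(key)
    return out.getvalue()
```

For example, calling it with a curated CSV text and an alternative CSV text yields a single CSV with one header and the rows of both inputs, which is roughly what placing the files in the same folder achieves at build time.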

For existing embeddings indexes that are a union of several folders, we can instead form the union at runtime, either by specifying multiple default_indexes or by using more than one index when the dc= param is set.
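
As a rough sketch of what a runtime union could look like, the snippet below searches several independently built indexes and merges the hits, keeping the best score per dcid. The in-memory `Index` layout, the `search_union` function, the cosine scoring, and the index names are all hypothetical; this is not the NL server's actual implementation, and it does not show how default_indexes or dc= are parsed.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical in-memory index: a list of (dcid, sentence, vector) entries.
Index = List[Tuple[str, str, List[float]]]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_union(
    indexes: Dict[str, Index],
    names: List[str],
    query_vec: List[float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Search every named index and merge results, best score per dcid."""
    best: Dict[str, float] = {}
    for name in names:
        for dcid, _sentence, vec in indexes[name]:
            score = cosine(query_vec, vec)
            if score > best.get(dcid, float("-inf")):
                best[dcid] = score
    # Rank the merged candidates by descending score.
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]
```

With this shape, each input folder can be built into its own index, and the list of names passed at query time decides which indexes participate in the union.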