datacommonsorg / website

Code for the Data Commons website
https://datacommons.org
Apache License 2.0

Consolidate embeddings build by distilling custom DC treatment as a standalone step #4342

Closed shifucun closed 3 weeks ago

shifucun commented 3 weeks ago

Handle GCS and local files through a FileManager class; the same class can be applied to base DC as well.
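A minimal sketch of what such a class could look like, not the code in this PR. It assumes callers always write to a local directory and that `gs://` targets are uploaded at the end. The method names (`local_path`, `finish`) and the injected `upload_fn` are hypothetical; in practice the uploader would wrap a real GCS client.

```python
import os
import shutil
import tempfile
from typing import Callable, Optional

GCS_PREFIX = "gs://"


def is_gcs_path(path: str) -> bool:
    """Returns True if the path points at a GCS location."""
    return path.startswith(GCS_PREFIX)


class FileManager:
    """Hypothetical sketch: exposes a local working directory regardless of
    whether the final target is a local folder or a gs:// folder."""

    def __init__(self,
                 target_path: str,
                 upload_fn: Optional[Callable[[str, str], None]] = None):
        self.target_path = target_path
        self._upload_fn = upload_fn
        if is_gcs_path(target_path):
            if upload_fn is None:
                raise ValueError("upload_fn is required for gs:// targets")
            # Stage files in a temp directory; they are uploaded in finish().
            self._tmp_dir: Optional[str] = tempfile.mkdtemp()
            self.local_dir = self._tmp_dir
        else:
            # Local target: write in place, nothing to upload later.
            self._tmp_dir = None
            os.makedirs(target_path, exist_ok=True)
            self.local_dir = target_path

    def local_path(self, name: str) -> str:
        """Local path that callers should read from or write to."""
        return os.path.join(self.local_dir, name)

    def finish(self) -> None:
        """Uploads staged files for gs:// targets, then removes the staging
        directory. Does nothing for local targets."""
        if self._tmp_dir is None:
            return
        try:
            # upload_fn(local_dir, gcs_dir) is an assumed signature.
            self._upload_fn(self._tmp_dir, self.target_path)
        finally:
            shutil.rmtree(self._tmp_dir, ignore_errors=True)
```

With this shape, build code writes to `fm.local_path(...)` and calls `fm.finish()` once, without branching on where the output ends up.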

All custom DC processing is handled by one function, save_custom_dc_artifacts. This makes it easy to consolidate the two build.py files later.