SkywardAI / cecilia

EDA tools and datasets generator for ML projects
https://www.kaggle.com/organizations/skywardai/datasets
Apache License 2.0

evaluate pipeline for datasets #36

Open Aisuko opened 3 months ago

Aisuko commented 3 months ago

We need a way to define what makes a dataset high quality, which means we need a pipeline that evaluates the similarity between the questions and the datasets.

https://www.kaggle.com/code/aisuko/semantic-search
https://www.kaggle.com/code/aisuko/kmeans-with-sk-learn
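
A rough sketch of what such an evaluation step could look like, using only the standard library. It is not taken from the linked notebooks. The `embed` function is a bag-of-words placeholder standing in for a real sentence-embedding model, and `evaluate_dataset`, the "best match per question" rule, and the mean score are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # Placeholder (assumption): a term-count vector instead of a real
    # sentence-embedding model. Swap in a model of your choice here.
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def evaluate_dataset(questions, records):
    """Hypothetical helper: score each question by its best-matching record.

    Returns (best_scores, mean_score), where best_scores[i] is the highest
    cosine similarity between questions[i] and any record.
    """
    record_vecs = [embed(r) for r in records]
    best_scores = []
    for question in questions:
        q_vec = embed(question)
        scores = [cosine(q_vec, r_vec) for r_vec in record_vecs]
        best_scores.append(max(scores, default=0.0))
    mean_score = sum(best_scores) / len(best_scores) if best_scores else 0.0
    return best_scores, mean_score


if __name__ == "__main__":
    questions = ["what is kmeans clustering", "how do I bake bread"]
    records = ["kmeans clustering groups similar points", "semantic search ranks documents"]
    best, mean = evaluate_dataset(questions, records)
    for question, score in zip(questions, best):
        print(f"{score:.2f}  {question}")
    print(f"mean best-match similarity: {mean:.2f}")
```

A low mean would suggest the dataset does not cover the questions well, though where to set the threshold is something we would still have to decide.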

For small paragraphs:

https://www.kaggle.com/code/aisuko/in-document-search-cross-encoder
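
For the small-paragraph case, one possible shape is to split the paragraph into sentences and score each (question, sentence) pair directly. The sketch below is not the notebook's code. `overlap_score` is a token-overlap placeholder where a cross-encoder model would go, and `split_sentences`, `rank_passages`, and `top_k` are hypothetical names chosen for illustration.

```python
import re


def split_sentences(paragraph):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def overlap_score(question, passage):
    # Placeholder (assumption): Jaccard overlap of token sets. A real
    # cross-encoder would score the (question, passage) pair jointly.
    q_tokens = set(re.findall(r"[a-z0-9]+", question.lower()))
    p_tokens = set(re.findall(r"[a-z0-9]+", passage.lower()))
    if not q_tokens or not p_tokens:
        return 0.0
    return len(q_tokens & p_tokens) / len(q_tokens | p_tokens)


def rank_passages(question, paragraph, scorer=overlap_score, top_k=3):
    """Return up to top_k (score, sentence) pairs, highest score first."""
    sentences = split_sentences(paragraph)
    scored = [(scorer(question, s), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    paragraph = (
        "KMeans groups points into clusters. "
        "Bananas are yellow. "
        "The sky is blue."
    )
    for score, sentence in rank_passages("How does KMeans form clusters?", paragraph):
        print(f"{score:.2f}  {sentence}")
```

Because `scorer` is just a callable taking two strings, a model-based pairwise scorer could be dropped in without changing the ranking logic.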