amosproj / amos2024ss08-cloud-native-llm

MIT License

Identify Webpages for the Kubernetes LLM dataset #6

Closed: dominic0df closed this issue 4 months ago

dominic0df commented 5 months ago

User story

  1. As a Data Engineer
  2. I want to identify websites for my dataset
  3. so that their data can be automatically extracted into a dataset

Acceptance criteria

DoD general criteria

dominic0df commented 5 months ago

Potential follow-up issue:

dominic0df commented 5 months ago

Possible approaches

dominic0df commented 4 months ago

Work already done in the last sprint: SP 02

julioc-p commented 4 months ago

You can download the output file here: https://huggingface.co/datasets/Kubermatic/cncf-raw-data-for-llm-training/resolve/main/landscape_augmented_repos_websites.yml?download=true
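If you would rather fetch the file from a script than from the browser, something like the following might work. It is only a sketch using the Python standard library: the helper name `fetch_file` and the local file name are made up for illustration, and it assumes the dataset is publicly readable without an access token.

```python
import shutil
import urllib.request
from pathlib import Path

# URL copied verbatim from the comment above.
OUTPUT_URL = (
    "https://huggingface.co/datasets/Kubermatic/cncf-raw-data-for-llm-training"
    "/resolve/main/landscape_augmented_repos_websites.yml?download=true"
)


def fetch_file(url: str, dest: Path) -> Path:
    """Stream the resource at `url` to `dest` and return `dest`.

    `fetch_file` is a hypothetical helper, not part of the project.
    """
    # Create the target directory if it does not exist yet.
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy the response body to disk in chunks instead of loading it all in memory.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Calling `fetch_file(OUTPUT_URL, Path("landscape_augmented_repos_websites.yml"))` should then save the YAML file in the current directory, where it can be opened with any YAML parser.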