amosproj/amos2024ss08-cloud-native-llm
MIT License
Extract And Store Text Data From CNCF Project Webpages
#56
Closed
grayJiaaoLi closed 2 weeks ago
grayJiaaoLi commented 1 month ago
User story
As a data engineer
I want to extract the text of each CNCF project page into our dataset
So that we can prepare enough training data for the LLM
Acceptance criteria
Use the previously extracted link of each page in the CNCF projects
Extract the text data from CNCF projects
Store the extracted text data into our raw dataset
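
The three criteria above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual implementation: the function names (`html_to_text`, `fetch_html`, `extract_pages`, `store_records`), the JSON Lines output format, and the record fields `url` and `text` are all assumptions made for the example.

```python
import json
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML document, one text chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_html(url, timeout=10):
    """Download a page and decode it using the charset the server declares."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_pages(urls, fetch=fetch_html):
    """Build one record per link; `fetch` is injectable so tests can avoid the network."""
    return [{"url": url, "text": html_to_text(fetch(url))} for url in urls]


def store_records(records, path):
    """Write records as JSON Lines (one JSON object per line) to `path`."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Given a list of page links `urls`, a call such as `store_records(extract_pages(urls), "raw_pages.jsonl")` would cover all three steps; the output file name here is hypothetical. A real pipeline would likely also need error handling for failed requests and some politeness delay between fetches.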
Definition of done (DoD)
All complex logic has been tested
All feature branches have been merged and closed
New feature code has been documented
Potential new licenses have been checked
All GitHub Actions are passing
The requirements.txt file has been updated
DoD general criteria
Feature has been fully implemented
Feature has been merged into the mainline
All acceptance criteria were met
Product owner approved features
All tests are passing
Developers agreed to release