VaibhavCodeClub / learn

Learning app for kids
MIT License
85 stars 82 forks

[Feature Request]: Crawl all video subtitles (transcripts) from Khan Academy to create a word or sentence list #200

Open atlantis451 opened 1 month ago

atlantis451 commented 1 month ago

Is there an existing issue for this?

Feature Description

Crawl all video transcripts from Khan Academy to create a list of 'learn' words or sentences.
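
As a rough illustration of the "word list" part of this request, here is a minimal sketch that turns one transcript's text into a frequency-sorted word list. The function name `build_word_list` and the regular expression are assumptions made for illustration, not part of the 'learn' app. It deliberately does no crawling and only works on text that has already been collected.

```python
import re
from collections import Counter

# Roughly "runs of letters in any script": word characters minus digits and underscore.
# This keeps Cyrillic words intact, so the same code can serve the EN and RU lists.
WORD_RE = re.compile(r"[^\W\d_]+")


def build_word_list(transcript: str) -> list[tuple[str, int]]:
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    words = (w.lower() for w in WORD_RE.findall(transcript))
    counts = Counter(words)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample text, not taken from a real Khan Academy transcript.
print(build_word_list("Two plus two is four."))
```

Contractions such as "let's" would be split at the apostrophe by this pattern; a real implementation would probably need a more careful tokenizer.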

Web scraping category

EN

  1. MATH: HIGH SCHOOL & COLLEGE https://www.khanacademy.org/math
  2. TEST PREP https://www.khanacademy.org/test-prep
  3. SCIENCE https://www.khanacademy.org/science
  4. COMPUTING https://www.khanacademy.org/computing
  5. ARTS & HUMANITIES https://www.khanacademy.org/humanities
  6. ECONOMICS https://www.khanacademy.org/economics-finance-domain
  7. READING & LANGUAGE ARTS https://www.khanacademy.org/ela
  8. LIFE SKILLS https://www.khanacademy.org/college-careers-more
  9. PARTNER COURSES https://www.khanacademy.org/partner-content

RU

  1. МАТЕМАТИКА (MATH) https://ru.khanacademy.org/math
  2. ЕСТЕСТВЕННЫЕ НАУКИ (NATURAL SCIENCES) https://ru.khanacademy.org/science
  3. ЭКОНОМИКА И ФИНАНСЫ (ECONOMICS AND FINANCE) https://ru.khanacademy.org/economics-finance-domain
  4. ИНФОРМАТИКА (COMPUTER SCIENCE) https://ru.khanacademy.org/computing
  5. ИСКУССТВО И ГУМАНИТАРНЫЕ НАУКИ (ARTS AND HUMANITIES) https://ru.khanacademy.org/humanities
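
In the two lists above, each RU URL differs from its EN counterpart only in the subdomain (`ru` instead of `www`). Assuming that pattern holds, a small helper can derive one from the other. The name `localized_url` is hypothetical, and the pattern is inferred only from the URLs listed in this issue.

```python
from urllib.parse import urlsplit, urlunsplit


def localized_url(url: str, lang: str) -> str:
    """Swap the first host label (e.g. "www") for a language subdomain (e.g. "ru")."""
    parts = urlsplit(url)
    # "www.khanacademy.org" -> ("www", ".", "khanacademy.org")
    _, _, rest = parts.netloc.partition(".")
    return urlunsplit(parts._replace(netloc=f"{lang}.{rest}"))


print(localized_url("https://www.khanacademy.org/math", "ru"))
# → https://ru.khanacademy.org/math
```

Not every EN category has an RU entry in the lists above (for example, TEST PREP is listed only under EN), so a derived URL would still need to be checked before crawling it.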

Use Case

Studying the 'learn' word list built from Khan Academy subtitles/transcripts lets a learner fully grasp the material when watching the Khan Academy videos, because learning requires review.

Even someone who doesn't know English at all can study the 'learn' word list and then go straight to Khan Academy to watch the videos and gain knowledge and skills.

Benefits

This would contribute to global education, especially in regions such as Africa, where the Khan Academy website does not support learners' native languages. They can learn from the 'learn' word list and then go to Khan Academy to acquire knowledge.

Add ScreenShots

Web scraping steps: open a web scraping category (EN or RU), for example 1. MATH: HIGH SCHOOL & COLLEGE, and navigate to the second-level directory.

[Screenshot 1]

Early math review > open the Unit 1 directory > click the play icon. The Video transcript section at the bottom of the page contains the subtitles.

[Screenshot 2]



Combine all the subtitles from the chapters under Early math review into one file, such as Early math review.txt.
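
The combining step could look something like the sketch below. It assumes the transcripts have already been collected as (video title, transcript text) pairs; how they are fetched is out of scope here. The function name `combine_transcripts` and the `# title` separator lines are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def combine_transcripts(
    course_title: str,
    transcripts: list[tuple[str, str]],
    out_dir: Path,
) -> Path:
    """Write all (video_title, transcript_text) pairs into <out_dir>/<course_title>.txt."""
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{course_title}.txt"
    # One section per video: a "# title" line followed by the trimmed transcript.
    sections = [f"# {title}\n{text.strip()}\n" for title, text in transcripts]
    out_path.write_text("\n".join(sections), encoding="utf-8")
    return out_path


# Hypothetical sample data, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = combine_transcripts(
        "Early math review",
        [("Video A", "first transcript"), ("Video B", "second transcript")],
        Path(tmp),
    )
    print(path.name)
    print(path.read_text(encoding="utf-8"))
```

A course title containing characters that are not valid in file names (such as `/`) would need to be sanitized first; this sketch does not handle that.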

Priority

High

Record

github-actions[bot] commented 1 month ago

Hi there! Thanks for opening this issue. We appreciate your contribution to this open-source project. We aim to respond or assign your issue as soon as possible.