allenai/dolma
Data and tools for generating and inspecting OLMo pre-training data.
https://allenai.github.io/dolma/
Apache License 2.0
Fixed issues and improved documentation in getting-started.md #216
Closed. aman-17 closed 2 weeks ago
aman-17 commented 2 weeks ago
Updates:
- Added a note in `getting-started.md` to guide users on selecting wiki-dump dates.
- Resolved a "module not found" issue in `make_wikipedia.py` by incorporating try-except statements.
- Added the `wikipedia-mixer.json` file and updated the documentation to improve clarity and ease of use.
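
For reviewers, the try-except approach mentioned above might look roughly like the sketch below. This is not the code from `make_wikipedia.py`: the helper `require_module`, the module name `hypothetical_wiki_dep`, and the install hint are all assumptions made for illustration. It only shows the general pattern of catching a missing import and re-raising it with an actionable message.

```python
import importlib
from types import ModuleType


def require_module(name: str, install_hint: str) -> ModuleType:
    """Import `name`, or raise ImportError with an install hint.

    Hypothetical helper for illustration; not part of dolma.
    """
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        # Re-raise with a readable message while keeping the original cause.
        raise ImportError(
            f"Module '{name}' is required but not installed. {install_hint}"
        ) from exc


# A standard-library module imports normally.
json_module = require_module("json", "It ships with Python.")

# A missing (hypothetical) dependency yields a clear error instead of a bare traceback.
try:
    require_module("hypothetical_wiki_dep", "Try: pip install hypothetical-wiki-dep")
except ImportError as err:
    message = str(err)
```

Wrapping the import this way keeps the failure at the point where the dependency is actually needed and tells the user what to install, which is usually friendlier than an unexplained `ModuleNotFoundError`.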