allenai/wimbd
What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets
Apache License 2.0
164 stars, 17 forks
issues
fuzzy search with slop
#17
yanaiela
closed
4 days ago
0
Unable to run several functions
#16
aflah02
closed
1 month ago
2
Index for Dolma v1.7
#15
WilliamsToTo
closed
3 days ago
2
Add `--with-locations` flag to `wimbd search`
#14
epwalsh
closed
2 months ago
0
Save documents where matched texts are from `wimbd search`
#13
yanaiela
closed
2 months ago
0
search index for falcon-refinedweb
#12
WilliamsToTo
closed
3 days ago
1
Added S3 Support
#11
revbucket
opened
4 months ago
1
Unable to use the elastic search index: AuthError
#10
vishaal27
closed
3 days ago
1
error when searching on "re_pile"
#9
WilliamsToTo
closed
3 days ago
1
error after getting cloud_id and api_key
#8
WilliamsToTo
closed
5 months ago
3
Add 'search' command for counting occurrences of regex patterns
#7
epwalsh
closed
5 months ago
0
regular expression lookup
#6
WilliamsToTo
closed
5 months ago
1
Passing in a directory does not work for me
#5
revbucket
closed
5 months ago
1
Allow directories instead of just file paths
#4
epwalsh
closed
5 months ago
0
Demo Code
#3
davzoku
closed
4 months ago
2
add missing fields to manifest
#2
epwalsh
closed
7 months ago
0
Update tokenizers to latest release
#1
epwalsh
closed
7 months ago
0