huggingface/dataset-viewer
Lightweight web API for visualizing and exploring any dataset - computer vision, speech, text, and tabular - stored on the Hugging Face Hub
https://huggingface.co/docs/datasets-server
Apache License 2.0
639 stars, 65 forks
issues
admin UI: automatically fill the steps list
#2917
severo
closed
2 weeks ago
1
[modality detection] One image in the repo -> Image modality
#2916
severo
opened
2 weeks ago
0
Bump urllib3 from 2.0.7 to 2.2.2 in /docs in the pip group across 1 directory
#2915
dependabot[bot]
opened
2 weeks ago
1
Prevents viewer from being pinged for the datasets on both leaderboard orgs
#2914
clefourrier
closed
2 weeks ago
1
dataset-filetypes is not a small step
#2913
severo
closed
2 weeks ago
0
Additional modalities detection
#2912
severo
closed
2 weeks ago
2
feat(chart): auto deploy when secrets change
#2911
rtrompier
closed
1 week ago
1
fix: extensions are always lowercase
#2910
severo
closed
2 weeks ago
0
Detect dataset modalities using dataset-filetypes
#2909
severo
closed
2 weeks ago
0
Add `started_at` field to cached response documents
#2908
polinaeterna
opened
2 weeks ago
0
Move secrets to Infisical
#2907
rtrompier
closed
2 weeks ago
1
[Config-parquet-and-info] Compute estimated dataset info
#2906
lhoestq
closed
2 weeks ago
4
Add step dataset-filetypes
#2905
severo
closed
2 weeks ago
0
Fix skipped async tests caused by pytest-memray
#2904
albertvillanova
closed
3 weeks ago
0
Pass copies of DataFrames instead of views
#2903
albertvillanova
closed
1 week ago
1
Minor fix id with length of str dataset name
#2902
albertvillanova
closed
3 weeks ago
0
Async tests using anyio are skipped after including pytest-memray
#2901
albertvillanova
closed
3 weeks ago
1
2754 partial instead of error
#2900
severo
closed
2 weeks ago
3
Standardize access to metrics and healthcheck
#2899
AndreaFrancis
opened
3 weeks ago
0
detect more modalities
#2898
severo
closed
2 weeks ago
8
Use `HfFileSystem` in config-parquet-metadata step instead of `HttpFileSystem`
#2897
polinaeterna
closed
2 weeks ago
0
Remove Prometheus context label
#2896
albertvillanova
closed
3 weeks ago
1
Too high label cardinality metrics in Prometheus
#2895
albertvillanova
closed
3 weeks ago
1
feat(ci): add trufflehog secrets detection
#2894
McPatate
closed
3 weeks ago
0
Fix string representation of storage client
#2893
albertvillanova
closed
3 weeks ago
0
Store `started_at` or duration info in cached steps too
#2892
polinaeterna
opened
3 weeks ago
2
Make StorageClient not warn when deleting a non-existing directory
#2891
albertvillanova
closed
3 weeks ago
1
No mongo cache in DatasetRemovalPlan
#2890
lhoestq
closed
3 weeks ago
1
No auto backfill on most nfaa datasets
#2889
lhoestq
closed
3 weeks ago
0
Fix get_shape in statistics when argument is bytes, not dict
#2888
polinaeterna
closed
3 weeks ago
0
Update ruff to 0.4.8
#2887
albertvillanova
closed
3 weeks ago
3
Update uvicorn (restart expired workers)
#2886
lhoestq
closed
3 weeks ago
1
add missing deps to dev images
#2885
lhoestq
closed
3 weeks ago
0
Add retry mechanism to get_parquet_file in parquet metadata step
#2884
polinaeterna
closed
4 weeks ago
2
Update pytest to 8.2.2 and pytest-asyncio to 0.23.7
#2883
albertvillanova
closed
3 weeks ago
0
Apply recommendations from duckdb to improve speed
#2882
severo
opened
4 weeks ago
1
Remove canonical datasets from docs
#2881
albertvillanova
opened
4 weeks ago
0
Allow mnist and fashion mnist + remove canonical dataset logic
#2880
lhoestq
closed
4 weeks ago
2
Use pymongoarrow to get dataset results as dataframe
#2879
AndreaFrancis
closed
4 weeks ago
1
Remove or increase the 5GB limit?
#2878
severo
opened
1 month ago
5
Update ruff to 0.4.7
#2877
albertvillanova
closed
1 month ago
0
Feature Request: Freeze/Restart/Log Viewer Option for Users.
#2876
kargaranamir
closed
1 month ago
2
Create a new error code (retryable) for "Consistency check failed"
#2875
severo
closed
1 month ago
4
Re-add torch dependency
#2874
lhoestq
closed
1 month ago
2
add "duration" field to audio cells
#2873
severo
opened
1 month ago
0
BFF endpoint to replace multiple parallel requests
#2872
severo
opened
1 month ago
1
Run the backfill on retryable errors every 2 hours (not every 30 min)
#2871
AndreaFrancis
closed
1 month ago
1
More webdataset fixes
#2870
lhoestq
closed
1 month ago
5
Refine blocked datasets for open llm leaderboard
#2869
lhoestq
closed
1 month ago
1
memory: use pymongoarrow to get dataset results as dataframe
#2868
severo
closed
3 weeks ago
1