viking-sudo-rm / rusty-dawg

Rust library for indexing and quickly searching large pretraining corpora
https://arxiv.org/abs/2406.13069
MIT License
17 stars, 2 forks

`DiskCdawg::load` fails silently if metadata exists but is empty #98

Open viking-sudo-rm opened 5 months ago

viking-sudo-rm commented 5 months ago

Ideally an error message should indicate the problem. Relatedly, creating and saving to RAM should not create an empty metadata file.
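
To make the request concrete, here is a minimal sketch of the kind of check that could produce such an error message. It uses only the standard library and is not the rusty-dawg implementation: `MetadataError` and `read_metadata` are hypothetical names, and nothing is assumed about the real metadata format or the internals of `DiskCdawg::load`.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type for illustration; not part of rusty-dawg.
#[derive(Debug)]
pub enum MetadataError {
    /// The file could not be read at all (missing, permissions, ...).
    Io { path: PathBuf, source: io::Error },
    /// The file exists and is readable but contains zero bytes.
    Empty { path: PathBuf },
}

impl fmt::Display for MetadataError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MetadataError::Io { path, source } => {
                write!(f, "could not read metadata file {}: {}", path.display(), source)
            }
            MetadataError::Empty { path } => {
                write!(f, "metadata file {} exists but is empty", path.display())
            }
        }
    }
}

impl std::error::Error for MetadataError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            MetadataError::Io { source, .. } => Some(source),
            MetadataError::Empty { .. } => None,
        }
    }
}

/// Hypothetical helper: read the raw metadata bytes, returning a
/// descriptive error instead of panicking when the file is unreadable
/// or empty.
pub fn read_metadata(path: &Path) -> Result<Vec<u8>, MetadataError> {
    let bytes = fs::read(path).map_err(|source| MetadataError::Io {
        path: path.to_path_buf(),
        source,
    })?;
    if bytes.is_empty() {
        return Err(MetadataError::Empty {
            path: path.to_path_buf(),
        });
    }
    Ok(bytes)
}
```

A loader built on a helper like this could propagate the error with `?` rather than calling `unwrap`, so the caller sees which file was at fault and why.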

In general, error messages surfaced through the Python bindings should be more transparent, rather than opaque messages about unwrap failures.