HuggingFaceDataCache for loading, analysing and verifying HuggingFace datasets.

@osledybazo, this is really cool! I can't wait to try it on some of those HuggingFace datasets. I made only one slight tweak: I renamed the test file and moved it directly next to the file it tests, like we did with RSpec. That feels better to me than mirroring a parallel folder structure in a separate test folder. I also like to put the _test last in the filename so that the file and its test file are grouped together in file lists.

@dereknorrbom, one test in the new test file is commented out because it makes a real external API call to HuggingFace. I think you already have a standard way to mark tests that should not run during development builds because they use external dependencies for real instead of mocking them. Could we apply that to the commented-out test so that we have the option of running it in full integration tests?
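
For illustration only, here is a minimal sketch of one way such a test could be kept in the file but skipped by default, using the standard library's unittest.skipUnless. The environment variable RUN_INTEGRATION_TESTS, the helper fetch_dataset_info, and the test names are all hypothetical; the project's actual convention for marking these tests may well differ.

```python
import os
import unittest

# Hypothetical opt-in switch (an assumption, not the project's real setting).
RUN_INTEGRATION = os.environ.get("RUN_INTEGRATION_TESTS") == "1"


def fetch_dataset_info(name):
    """Hypothetical placeholder for a call that would reach the HuggingFace API."""
    raise NotImplementedError("replace with the real external call")


class HuggingFaceDataCacheTest(unittest.TestCase):
    def test_cache_key_is_deterministic(self):
        # Pure logic with no network access, so it always runs.
        self.assertEqual("org/dataset".replace("/", "__"), "org__dataset")

    @unittest.skipUnless(RUN_INTEGRATION, "set RUN_INTEGRATION_TESTS=1 to run")
    def test_loads_real_dataset(self):
        # Would hit the external service; skipped unless explicitly enabled.
        info = fetch_dataset_info("org/dataset")
        self.assertIsNotNone(info)
```

With this shape the external test stays in the file uncommented, is reported as skipped in ordinary development runs, and runs only when the switch is set for a full integration pass. If the project uses pytest, a custom marker selected with the -m option would serve the same purpose.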

AnthusAI / Plexus

HuggingFaceDataCache for loading, analysing and verifying HuggingFace datasets. #10