hyperdimensional-computing / torchhd

Torchhd is a Python library for Hyperdimensional Computing and Vector Symbolic Architectures
https://torchhd.readthedocs.io
MIT License

New datasets #96

Status: Closed (closed by denkle 1 year ago)

denkle commented 1 year ago

This is a first attempt at adding datasets from the collection used in “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” The file for the first dataset is one of the most important to review, because the files for the other datasets in the collection will closely follow what it specifies.

mikeheddes commented 1 year ago

Thanks for submitting this PR! It looks great. I think having all these datasets as part of the library is a great addition, and from here it should not be too hard to add more of them. Great work!

mikeheddes commented 1 year ago

I am resolving some minor outstanding issues and will push my changes soon. Small question: is the number of folds always 4, or is it dataset-dependent?

denkle commented 1 year ago

Yes, for datasets in the collection the number of folds is always 4.
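
For readers following along, here is a minimal sketch of what a fixed 4-fold evaluation loop could look like. It is not Torchhd's API and not the code from this PR: `make_folds` and `cross_validate` are hypothetical helpers written for illustration, and the interleaved split is an assumption (the collection may instead ship its own predefined fold indices).

```python
from statistics import mean

NUM_FOLDS = 4  # per the thread, the same for every dataset in the collection


def make_folds(num_samples, num_folds=NUM_FOLDS):
    """Hypothetical helper: split sample indices into (train, test) index lists."""
    indices = list(range(num_samples))
    folds = []
    for k in range(num_folds):
        # Fold k tests on every num_folds-th sample, starting at offset k.
        test = indices[k::num_folds]
        test_set = set(test)
        train = [i for i in indices if i not in test_set]
        folds.append((train, test))
    return folds


def cross_validate(evaluate, num_samples):
    """Average a user-supplied evaluate(train_idx, test_idx) score over all folds."""
    return mean(evaluate(train, test) for train, test in make_folds(num_samples))
```

Because the fold count is constant, a loop like this would not need any per-dataset configuration; only `num_samples` and the `evaluate` callback change between datasets.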

mikeheddes commented 1 year ago

@denkle, could you review my refactoring of the _load_data methods? I want to make sure I didn't break anything. Otherwise, I think it's good to go.

denkle commented 1 year ago

@mikeheddes, great revision of the code! The logic is more streamlined in multiple places. I do not see any problems with the _load_data methods, so I assume it is good to go.