Bug description
Selecting a sub(data)set and then cloning a dataset wraps the labels in a superfluous "ndarray()". This affects PytorchTextClassificationDataset and TransformersDataset.
Edit: I noticed this because clf.predict() on the cloned dataset raised TypeError: len() of unsized object.
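To illustrate the symptom outside of small-text, here is a minimal NumPy-only sketch. It assumes the superfluous wrapper turns each label into a zero-dimensional ndarray, which would explain the TypeError; the label values and the helper names (try_len, is_wrapped_scalar) are hypothetical and not part of small-text.

```python
import numpy as np

# Hypothetical labels for a single-label dataset (not taken from the issue).
labels = np.array([0, 1, 1, 0])

# Indexing yields NumPy integer scalars, one per example.
plain = [labels[i] for i in range(2)]

# Assumed failure mode: each label is wrapped again, giving 0-d ndarrays.
wrapped = [np.array(label) for label in plain]


def try_len(value):
    """Return len(value), or the error message if len() raises TypeError."""
    try:
        return len(value)
    except TypeError as exc:
        return str(exc)


def is_wrapped_scalar(value):
    """True if value is a zero-dimensional ndarray, i.e. a wrapped scalar."""
    return isinstance(value, np.ndarray) and value.ndim == 0


message = try_len(wrapped[0])
print(repr(wrapped[0]))               # shows the extra array(...) wrapper
print(message)                        # same message as in the TypeError above
print(is_wrapped_scalar(wrapped[0]))  # wrapped label is detected
print(is_wrapped_scalar(plain[0]))    # unwrapped label is not
```

A check like is_wrapped_scalar could be used to compare the labels of the original and the cloned dataset, if that assumption about the cause holds.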
Steps to reproduce
Example for TransformersDataset:
Output:
Expected behavior
Expected Output:
Environment:
Python version: 3.8
small-text version: 1.3.0
small-text integrations (e.g., transformers): transformers
PyTorch version (if applicable): -
Installation (pip, conda, or from source): pip
CUDA version (if applicable): -
Additional information
--