scikit-learn-contrib / hiclass

A python library for hierarchical classification compatible with scikit-learn
BSD 3-Clause "New" or "Revised" License

Possible Bug - Error when creating numpy array out of list of lists having different length #137

Open samiul123 opened 3 days ago

samiul123 commented 3 days ago

When the labels are hierarchical and represented as a list of lists whose inner lists do not all have the same length, the numpy array constructor raises the following error:

setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (1781,) + inhomogeneous part.

This happens in HierarchicalClassifier.py at line 164.

Looking at line 171, where the labels are leveled, I wonder whether the leveling should be done before the conversion to a numpy array.
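To illustrate the idea of leveling before conversion, here is a rough sketch that pads every label path to the same depth and only then builds the array. The helper `pad_labels` and the empty-string placeholder are hypothetical choices made for this example; they are not taken from the hiclass source.

```python
import numpy as np


def pad_labels(y, placeholder=""):
    """Pad each label path to the length of the longest path (hypothetical helper)."""
    max_depth = max(len(path) for path in y)
    return [list(path) + [placeholder] * (max_depth - len(path)) for path in y]


y = [["a"], ["b", "c"]]

# After padding, every row has the same length, so numpy can build a 2-D array.
y_padded = np.array(pad_labels(y))
print(y_padded.shape)  # (2, 2)
```

Padding the labels this way before calling `fit` might serve as a temporary workaround, assuming the library treats the placeholder as a missing level; that has not been verified here.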


Sample data: Y_train_modified = [['a'], ['b', 'c']]

This code gives the above error:

    lcppn2 = LocalClassifierPerParentNode(local_classifier=model, verbose=1, bert=True)
    lcppn2.fit(X_train, Y_train_modified)

hiclass version: 4.12.1
numpy version: 1.26.4