yourh / AttentionXML

Implementation for "AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification"

Error when MultiLabelBinarizer get CSR matrix #36

Closed by ThejaniYapa 10 months ago

ThejaniYapa commented 1 year ago

I am using scipy==1.11.2 and trying to train on the Amazon-670K dataset in a Colab notebook. Can you please help me find a way around this error?

[I 230818 11:55:13 main:37] Model Name: AttentionXML
[I 230818 11:55:13 main:40] Loading Training and Validation Set
[I 230818 11:55:13 main:52] Number of Labels: 34399
[I 230818 11:55:13 main:53] Size of Training Set: 880
[I 230818 11:55:13 main:54] Size of Validation Set: 120
[I 230818 11:55:13 main:56] Training
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
/content/drive/MyDrive/xmlc_research/attention_xml/deepxml/optimizers.py:108: UserWarning: This overload of add_ is deprecated:
    add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
    add_(Tensor other, *, Number alpha) (Triggered internally at ../torch/csrc/utils/python_arg_parser.cpp:1485.)
  exp_avg.mul_(beta1).add_(1 - beta1, grad)
[I 230818 11:58:03 models:114] SWA Initializing
Traceback (most recent call last):
  File "/content/drive/MyDrive/xmlc_research/attention_xml/main.py", line 95, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/content/drive/MyDrive/xmlc_research/attention_xml/main.py", line 64, in main
    model.train(train_loader, valid_loader, **model_cnf['train'])
  File "/content/drive/MyDrive/xmlc_research/attention_xml/deepxml/models.py", line 76, in train
    p5, n5 = get_p_5(labels, targets), get_n_5(labels, targets)
  File "/content/drive/MyDrive/xmlc_research/attention_xml/deepxml/evaluation.py", line 42, in get_precision
    mlb = get_mlb(classes, mlb, targets)
  File "/content/drive/MyDrive/xmlc_research/attention_xml/deepxml/evaluation.py", line 33, in get_mlb
    mlb = MultiLabelBinarizer(range(targets.shape[1]), sparse_output=True)
TypeError: MultiLabelBinarizer.__init__() takes 1 positional argument but 2 positional arguments (and 1 keyword-only argument) were given

yourh commented 1 year ago

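I guess you are using the wrong version of sklearn. The traceback shows MultiLabelBinarizer rejecting a positional argument, which suggests a newer release than the one this code was written for.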
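
A possible workaround that does not require downgrading is to pass `classes` by name in `get_mlb`. This is a sketch under the assumption that the installed scikit-learn accepts `classes` only as a keyword argument, which is what the `TypeError` suggests. The snippet is a standalone illustration, not the repository's code; `num_labels` and the sample label lists are made up for the demo.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical stand-in for targets.shape[1] in deepxml/evaluation.py.
num_labels = 5

# Passing `classes` by keyword works whether or not the installed
# scikit-learn still accepts it as a positional argument.
mlb = MultiLabelBinarizer(classes=range(num_labels), sparse_output=True)

# Made-up label lists for two samples.
y = mlb.fit_transform([[0, 2], [4]])

# With sparse_output=True the result is a SciPy sparse matrix.
print(type(y).__name__, y.shape)
print(y.toarray())
```

Applied to line 33 of `deepxml/evaluation.py` from the traceback, the call would presumably become `MultiLabelBinarizer(classes=range(targets.shape[1]), sparse_output=True)`. Alternatively, installing the scikit-learn version the repository was developed against should avoid the error without code changes.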