Refefer / fastxml

FastXML / PFastXML / PFastreXML - Implementation of Extreme Multi-label Classification

UnicodeDecodeError #16

Open stefan027 opened 5 years ago

stefan027 commented 5 years ago

Hi

I'm using Python 3.6.5 on Ubuntu 18.04. I'm able to train a classifier using:

trainer = Trainer(n_trees = 18)
trainer.fit(X_train_list, y_train)
trainer.save('fastxml.trained')

X_train_list above is a list of csr_matrix objects.

The settings file I get looks like this:

{"n_trees": 18, "max_leaf_size": 10, "max_labels_per_leaf": 20, "re_split": 0, "n_jobs": 1, "alpha": 0.0001, "seed": 2016, "n_epochs": 2, "n_updates": 100.0, "verbose": false, "bias": true, "subsample": 1, "loss": "log", "sparse_multiple": 25, "leaf_classifiers": false, "gamma": 30, "blend": 0.8, "leaf_eps": 1e-05, "optimization": "fastxml", "engine": "auto", "auto_weight": 32, "eps": 1e-06, "C": 1, "leaf_probs": false, "n_labels": 58}

However, when I try to use the Inferencer, I get a UnicodeDecodeError such as this:

clf = Inferencer('fastxml.trained')

----------------------------------------------------

Traceback (most recent call last):
  File "fastxml/inferencer.pyx", line 123, in fastxml.inferencer.load_sparse
    values = read_row(f, 'If')
  File "fastxml/inferencer.pyx", line 107, in fastxml.inferencer.read_row
    d = f.read(struct.calcsize('I'))
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

Thank you.

EricYangsw commented 5 years ago

Hi, I'm hitting the same problem. Have you solved it?

stefan027 commented 5 years ago

Hi, no, unfortunately I haven't solved this problem yet.

Refefer commented 5 years ago

Sorry for the long delay; I somehow completely missed this bug report. Are you still facing the problem?

purviprajapati196 commented 4 years ago

I am facing the same problem with encoding/decoding.

Traceback (most recent call last):
  File "fastxml\inferencer.pyx", line 123, in fastxml.inferencer.load_sparse
  File "fastxml\inferencer.pyx", line 107, in fastxml.inferencer.read_row
  File "C:\Program Files\Python36\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 40: character maps to <undefined>
Exception ignored in: 'fastxml.inferencer.load_sparse'

purviprajapati196 commented 4 years ago

> Hi, no, unfortunately I haven't solved this problem yet.

I am facing the same problem. Kindly help me resolve it.
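
Both tracebacks in this thread fail inside a text codec (utf-8 on Ubuntu, cp1252 on Windows) while read_row is reading struct.calcsize(...) worth of data. That pattern is what Python 3 produces when struct-packed binary data is read from a file opened in text mode. The source of inferencer.pyx is not shown here, so treat this as an assumption about the cause rather than a confirmed diagnosis. The sketch below only reproduces the symptom with the standard library; write_row, read_row_text_mode, read_row_binary_mode and demo are hypothetical helpers and not part of fastxml.

```python
import os
import struct
import tempfile

# Same format string that appears in the traceback: unsigned int + float.
ROW_FORMAT = "If"


def write_row(path, index, value):
    """Write one struct-packed row to `path` in binary mode."""
    with open(path, "wb") as f:
        f.write(struct.pack(ROW_FORMAT, index, value))


def read_row_text_mode(path):
    """Read the row through a text codec, which is what fails on Python 3."""
    with open(path, "r", encoding="utf-8") as f:
        # The raw bytes are decoded as UTF-8 before being returned.
        return f.read(struct.calcsize(ROW_FORMAT))


def read_row_binary_mode(path):
    """Read the row as raw bytes and unpack it."""
    with open(path, "rb") as f:
        data = f.read(struct.calcsize(ROW_FORMAT))
    return struct.unpack(ROW_FORMAT, data)


def demo():
    """Return (text_mode_failed, row_read_in_binary_mode)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # 0xFFFFFFFF packs to four 0xff bytes, which are never valid UTF-8.
        write_row(path, 0xFFFFFFFF, 0.5)
        try:
            read_row_text_mode(path)
            text_mode_failed = False
        except UnicodeDecodeError:
            text_mode_failed = True
        row = read_row_binary_mode(path)
    finally:
        os.remove(path)
    return text_mode_failed, row


if __name__ == "__main__":
    print(demo())
```

If the assumption holds, the text-mode read raises UnicodeDecodeError while the "rb" read returns the original values, which would suggest checking how the saved model files are opened when the Inferencer loads them.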