UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

can't train with triplet loss #407

Open sueqian6 opened 4 years ago

sueqian6 commented 4 years ago

Hi,

I tried to train on the wikipedia dataset with triplet loss using the code given here: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/other/training_wikipedia_sections.py

However, I get this error: TypeError: must be real number, not NoneType


The same happens when I use a custom triplet dataset.

Thanks.
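For readers unfamiliar with the setup: a triplet dataset pairs each anchor with one positive and one negative example, and triplet loss penalizes cases where the anchor is not closer to the positive than to the negative by at least some margin. Below is a minimal plain-Python sketch of that general formula with Euclidean distance. It is not the library's implementation (which works on batches of tensor embeddings); the function names, the margin value, and the toy vectors are all made up for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0).

    `margin` is an illustrative value, not a library default.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


# Toy 2-D "embeddings" (hypothetical values, chosen for easy arithmetic).
anchor = [0.0, 0.0]
near = [3.0, 4.0]   # distance 5 from the anchor
far = [6.0, 8.0]    # distance 10 from the anchor

# Positive is already much closer than the negative: 5 - 10 + 1 < 0, so no loss.
loss_satisfied = triplet_loss(anchor, positive=near, negative=far)

# Roles swapped: the "positive" is farther away, so 10 - 5 + 1 = 6 is penalized.
loss_violated = triplet_loss(anchor, positive=far, negative=near)
```

Every triplet needs all three parts filled in; the loss is zero only once the positive is closer than the negative by at least the margin.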

nreimers commented 4 years ago

I just updated the example. Please check if it works.

sueqian6 commented 4 years ago

> I just updated the example. Please check if it works.

Thank you! That's super fast.

omerarshad commented 4 years ago

@sueqian6 @nreimers Where can I find the data to train with triplet loss?

nreimers commented 4 years ago

The download links are in the respective example scripts.

omerarshad commented 4 years ago

It says "You can get the dataset by running examples/datasets/get_data.py", but there is no "datasets" folder inside "examples".

nreimers commented 4 years ago

Which script / URL are you referring to?

In the most recent scripts on GitHub, the datasets folder was removed, and each script now downloads the datasets it needs. There might still be some old scripts that do not download their dataset.

If you can point me to that script, I can fix it.

omerarshad commented 4 years ago

training_wikipedia_sections.py

nreimers commented 4 years ago

This script downloads the data automatically (line 31): https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/other/training_wikipedia_sections.py
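The "download inside the script" pattern described here can be sketched as a small check-then-fetch helper. This is a standard-library illustration only, not the code from the example script: `ensure_dataset`, its parameters, and the URL and path in the usage comment are hypothetical placeholders.

```python
import os
import urllib.request


def ensure_dataset(path, url, fetch=urllib.request.urlretrieve):
    """Download `url` to `path` only if `path` does not exist yet.

    Returns True if a download was triggered, False if the file was
    already present. `fetch` takes (url, path), like urlretrieve, and
    can be swapped out (for example in tests).
    """
    if os.path.exists(path):
        return False
    parent = os.path.dirname(path)
    if parent:
        # Create the target folder so the script works from a fresh checkout.
        os.makedirs(parent, exist_ok=True)
    fetch(url, path)
    return True


# Hypothetical usage at the top of a training script (placeholder URL and path):
# ensure_dataset("datasets/triplets.zip", "https://example.com/triplets.zip")
```

With a check like this at the top of the script, no separate data-fetching step is needed before training, and repeated runs reuse the file that is already on disk.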

omerarshad commented 4 years ago

The line, which you have now removed, was confusing. Thanks.