Hello!
First of all, thank you very much for the repo - it is quite handy!
I found one behavior that seems ambiguous (to me, at least) and may confuse other users.
If I have a list with 10 sentences, then the function
laser_model.embed_sentences(list_of_sents, lang='en')
returns a 10x1024 matrix.
On the other hand, if I provide the language not as a string but as a list with a single string, then the function
laser_model.embed_sentences(list_of_sents, lang=['en'])
returns a 1x1024 matrix.
At first I thought it could be due to some aggregation, such as the mean of all 10 vectors. However, according to the code, it is clearly due to the zip function, which stops at the shorter of its inputs, so presumably only the first sentence gets paired with a language and embedded. I think it might be a good idea either to add a warning or to raise an error in such a case. Though, it is just a suggestion!
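To illustrate the suggestion, here is a minimal sketch of the zip truncation and of one possible length check. It is only an assumption about how such a check could look: `normalize_langs` is a hypothetical helper and not part of the library, and the sketch does not call the real `embed_sentences`.

```python
from typing import List, Sequence, Union


def normalize_langs(sentences: Sequence[str],
                    lang: Union[str, Sequence[str]]) -> List[str]:
    """Hypothetical helper: return one language code per sentence.

    A single string is repeated for every sentence; a sequence must
    already have the same length as the sentences, otherwise we raise.
    """
    if isinstance(lang, str):
        return [lang] * len(sentences)
    langs = list(lang)
    if len(langs) != len(sentences):
        raise ValueError(
            f"got {len(sentences)} sentences but {len(langs)} language codes"
        )
    return langs


sentences = [f"sentence {i}" for i in range(10)]

# zip() stops at the shorter input, so a one-element language list
# silently keeps only the first sentence.
pairs = list(zip(sentences, ["en"]))
print(len(pairs))  # 1

# With the check in place, a plain string is expanded to 10 codes ...
print(len(normalize_langs(sentences, "en")))  # 10

# ... and a mismatched list fails loudly instead of being truncated.
try:
    normalize_langs(sentences, ["en"])
except ValueError as err:
    print(err)
```

On Python 3.10 and later, `zip(sentences, langs, strict=True)` would raise a `ValueError` on the same mismatch, which may be a simpler option if older versions do not need to be supported.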