-
For example, [awesome-align](https://github.com/neulab/awesome-align) supports generating word-by-word parallel corpus alignments, i.e. Pharaoh-format files.
Or can we even achieve this in the cur…
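For reference, the Pharaoh format is one `i-j` word-index pair per alignment link, one line per sentence pair, matching the `src ||| tgt` bitext that awesome-align reads. A minimal sketch of loading such output back into word pairs (file names are placeholders):

```python
# Sketch: read a "src ||| tgt" bitext plus Pharaoh-format alignments
# ("i-j" index pairs per line) into lists of aligned word pairs.
def read_pharaoh(bitext_path, align_path):
    with open(bitext_path, encoding="utf-8") as bf, open(align_path, encoding="utf-8") as af:
        for sent_line, align_line in zip(bf, af):
            src, tgt = (side.split() for side in sent_line.strip().split(" ||| "))
            links = [tuple(map(int, pair.split("-"))) for pair in align_line.split()]
            yield [(src[i], tgt[j]) for i, j in links]   # index pairs -> word pairs

# Usage with placeholder file names:
# for pairs in read_pharaoh("train.src-tgt", "train.align"):
#     print(pairs)
```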
-
Could you please give some instructions on how to run lesson29-BERT?
I tried "pip3 install keras-bert", which came from the link to the other work this is based on, then ran "python3 tokeniz…
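In case it helps to debug the install itself: keras-bert's `Tokenizer` can be exercised on its own, roughly as in its README (the tiny `token_dict` below is a toy stand-in for a real BERT `vocab.txt`):

```python
# Sanity check that keras-bert is importable; token_dict is a toy vocab.
from keras_bert import Tokenizer

token_dict = {'[CLS]': 0, '[SEP]': 1, 'un': 2, '##aff': 3, '##able': 4, '[UNK]': 5}
tokenizer = Tokenizer(token_dict)

print(tokenizer.tokenize('unaffable'))      # ['[CLS]', 'un', '##aff', '##able', '[SEP]']
indices, segments = tokenizer.encode('unaffable')
print(indices, segments)                    # token ids and segment ids
```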
-
I have a text classification problem where I need to classify text into one of 4 categories. I would like to use SBERT, but I read that the cross-encoder only takes pair input.
How do I go about doing this…
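One common pattern (a sketch, not necessarily what the maintainers would recommend): keep cross-encoders for pair scoring, and for single-text multi-class classification use a bi-encoder to embed each text, then train an ordinary classifier on the embeddings:

```python
# Sketch: 4-class single-text classification with SBERT embeddings
# plus logistic regression. The texts and labels are toy placeholders.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

train_texts  = ["refund please", "great product", "late delivery", "how do I reset my password"]
train_labels = [0, 1, 2, 3]                      # one toy example per class

encoder = SentenceTransformer("all-MiniLM-L6-v2")
X_train = encoder.encode(train_texts)            # one fixed-size vector per text
clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)

print(clf.predict(encoder.encode(["the package arrived two weeks late"])))
```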
-
> A more sophisticated approach looks at the distribution of predictions, and makes
> an informed trade-off between the true positive rate (in this context also known as recall or hit
> rate) and accuracy …
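To make the quoted trade-off concrete, one way (a sketch with toy numbers) is to sweep a decision threshold over the model's scores and watch recall (hit rate) and accuracy move against each other:

```python
# Sketch: threshold sweep over prediction scores; toy labels and scores.
import numpy as np

y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0, 1, 0])
scores = np.array([0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.55, 0.30, 0.70, 0.45])

for t in np.linspace(0.1, 0.9, 9):
    preds = (scores >= t).astype(int)
    recall = (preds[y_true == 1] == 1).mean()     # hit rate on the positives
    accuracy = (preds == y_true).mean()
    print(f"threshold={t:.1f}  recall={recall:.2f}  accuracy={accuracy:.2f}")
```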
-
There seem to be 417 language varieties represented in https://opus.nlpl.eu/JW300.php. This would imply 417C2 = 86,736 undirected language pairs. However, I only count 54,376 of them, and the paper c…
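For reference, the pair count is just a binomial coefficient, so the arithmetic itself checks out:

```python
import math

print(math.comb(417, 2))   # 86736 undirected pairs among 417 varieties
print(86736 - 54376)       # 32360 pairs that would be unaccounted for
```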
-
I am using `sentence-transformers` to encode large texts into embeddings for a text classification task. However, I'm unsure how to compare the quality of the embeddings when evaluating multiple m…
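One practical yardstick (a sketch, assuming the downstream classification task is what matters): score each candidate model by the cross-validated accuracy of a simple classifier trained on its embeddings. Texts, labels, and model names below are placeholders:

```python
# Sketch: compare sentence-transformers models by downstream classification
# accuracy on their embeddings.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

texts  = ["good service", "terrible support", "loved it", "very slow replies",
          "would not recommend", "fantastic team", "okay overall", "great value"]
labels = [1, 0, 1, 0, 0, 1, 0, 1]                 # toy binary labels

for name in ["all-MiniLM-L6-v2", "all-mpnet-base-v2"]:
    X = SentenceTransformer(name).encode(texts)
    acc = cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=4).mean()
    print(f"{name}: mean CV accuracy {acc:.2f}")
```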
-
Hi All,
I'm trying to fine-tune an existing sentence-transformer model (all-MiniLM-L6-v2) to get better scores on my sentence similarity problem. The test data shows ~70% accuracy and I'd like to impro…
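In case a concrete starting point helps: a minimal fine-tuning loop with the classic `model.fit` API and `CosineSimilarityLoss`, assuming similarity labels in [0, 1] (the two pairs below are toy placeholders):

```python
# Sketch: fine-tune all-MiniLM-L6-v2 on similarity-scored sentence pairs.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("all-MiniLM-L6-v2")

train_examples = [
    InputExample(texts=["A man is eating food.", "A man is eating a meal."], label=0.9),
    InputExample(texts=["A man is eating food.", "A plane is taking off."], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)],
          epochs=1,
          warmup_steps=10)
model.save("all-MiniLM-L6-v2-finetuned")      # output path is a placeholder
```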
-
My thoughts from #40.
> I just worry some about data loss when converting back and forth is so easy.
There are 3 things that can be lost when converting from CoNLL-U to CG3:
1. XPOSTAG (espec…
-
Thank you for creating this wonderful package. I just had a quick question about improving the accuracy of the alignment. Do you have any suggestions about text preprocessing, especially with symbols…
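Not a maintainer recommendation, just one normalization pass that is often tried before alignment so that symbols and punctuation tokenize the same way on both sides of the bitext:

```python
# Sketch: light normalization before alignment; splits symbols/punctuation
# into their own tokens and collapses whitespace.
import re
import unicodedata

def normalize(line: str) -> str:
    line = unicodedata.normalize("NFKC", line)    # unify unicode forms (e.g. full-width digits)
    line = re.sub(r"([^\w\s])", r" \1 ", line)    # space out punctuation and symbols
    return re.sub(r"\s+", " ", line).strip()

print(normalize("Prices rose 5%—“unexpectedly”, analysts said."))
# -> Prices rose 5 % — “ unexpectedly ” , analysts said .
```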
-
There is already an example of a [simple imperative programming language](https://github.com/MUSTE-Project/MULLE/tree/develop/examples/grammars/programming), but there is no good exercise for it. Here…