-
I suspect that @jakerylandwilliams & @andyreagan's https://github.com/jakerylandwilliams/partitioner could significantly improve the tokenization quality of NLTK, specifically when it comes to MWEs (M…
-
***(This initial comment has been updated to clarify the topic. Replies to this comment may not make sense because the description has changed.)***
The grammars for Fortran need to be updated.…
-
-
My (quite possibly incorrect) understanding of an "epoch" is that it is a set of iterations over which the model is exposed to each item in the training set exactly once.
In trying to understand the system be…
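To make that reading of "epoch" concrete, here is a minimal sketch, assuming plain mini-batch iteration with no shuffling and no dropped final batch. The names `iterations_per_epoch` and `run_epochs` are hypothetical and are not taken from the system being discussed, which may define an epoch differently.

```python
import math
from collections import Counter


def iterations_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of mini-batch iterations needed to see every example once."""
    return math.ceil(num_examples / batch_size)


def run_epochs(examples, batch_size, num_epochs):
    """Yield (epoch, iteration, batch) tuples.

    Under the assumed definition, one epoch is the set of iterations
    that together cover each training example exactly one time.
    """
    for epoch in range(num_epochs):
        for iteration, start in enumerate(range(0, len(examples), batch_size)):
            # The last batch of an epoch may be smaller than batch_size.
            yield epoch, iteration, examples[start:start + batch_size]


def exposures_per_epoch(examples, batch_size, num_epochs):
    """Count how often each example is seen in each epoch (hypothetical helper)."""
    counts = {epoch: Counter() for epoch in range(num_epochs)}
    for epoch, _, batch in run_epochs(examples, batch_size, num_epochs):
        counts[epoch].update(batch)
    return counts
```

With 10 examples and a batch size of 4, `iterations_per_epoch(10, 4)` gives 3 iterations per epoch, and `exposures_per_epoch` reports a count of 1 for every example in every epoch, which is the property the definition above describes.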
-
Working with the HTRC data, which is OCR'd from book scans, a sizeable portion of the wordlist is simply OCR errors. While some errors are meaningful (e.g. for estimating the usage of the medial S), most of…
-
Could we perhaps add some shortcuts to the individual jmdictdb entry pages for checking the ngrams for all kanji and readings? Maybe not for everyone, but at least for logged-in editors?
For exampl…
-
Following the discussion in Issue #93 I have been exploring the options for including a number of JMnedict (proper name) entries in the daily JMdict release. The base entries would stay in the JMnedic…
-
Related to r-quantities/units#134 (and others)
I'm not sure how to best discuss this. In the end, I don't think that it will be part of the `units` library, but I would like to engage both @Enchuf…
-
**Describe the bug**
Every time I open VS Code on macOS, the LTeX error "_Could not run ltex-ls with Java, please see the output panel 'LTeX Language Client' for details. You might want to try offline…
-
Hi Lyu, I read your paper "Embedding API dependency graph for neural code generation" and learned a lot from it.
I downloaded the full dataset and ran the code in your GitHub repository as you sugge…