-
One way we could improve space/cycle performance is to use a very simple peephole optimizer. Common pairs of words of the form ( ... -- n ) ( n -- ... ) can often be replaced by a more efficient co…
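A minimal sketch of such a peephole pass, in Python for illustration only. The rule table is an assumption, not taken from the project: `dup drop` has no net stack effect, and the literal `1` followed by `+` can be fused into the standard word `1+`. The function name `peephole` is hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed rewrite rules: an adjacent (producer, consumer) pair of words
# maps to a shorter equivalent sequence. Extend the table as needed.
RULES: Dict[Tuple[str, str], List[str]] = {
    ("dup", "drop"): [],    # push a copy, then discard it: no net effect
    ("1", "+"): ["1+"],     # literal 1 followed by + fuses into 1+
}


def peephole(words: List[str],
             rules: Dict[Tuple[str, str], List[str]] = RULES) -> List[str]:
    """Rewrite adjacent word pairs using `rules`, in a single left-to-right pass."""
    out: List[str] = []
    for word in words:
        out.append(word)
        # Re-check the tail after every rewrite, so pairs that only become
        # adjacent once an earlier pair is removed are matched as well.
        while len(out) >= 2 and (out[-2], out[-1]) in rules:
            replacement = rules[(out[-2], out[-1])]
            del out[-2:]
            out.extend(replacement)
    return out


optimized = peephole(["x", "dup", "drop", "1", "+"])
```

The rules here only ever shorten the sequence, so the inner loop terminates; a rule table with replacements that recreate a matching pair would need an extra guard.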
-
I wrote a hacky algorithm to find likely Joycean compounds. It excludes any words already tagged as compounds in the XML, as well as any words inside a foreign-language tag. There are plenty of fal…
-
Not having compounding is an issue.
Would it be possible to fall back on Hunspell when Morfologik does not accept a word?
This way, frequent words could be in Morfologik, the long compounding tail …
ghost updated
2 years ago
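The fallback idea in the entry above can be sketched as a chain of acceptors. This is only an illustration under stated assumptions: `primary` and `compound_fallback` are hypothetical stand-ins and do not call the real Morfologik or Hunspell APIs, and the word list is invented for the example.

```python
from typing import Callable

# Hypothetical stand-in for a dictionary of frequent words.
FREQUENT = {"haus", "boot"}


def primary(word: str) -> bool:
    """Stand-in for the fast dictionary lookup (assumed, not a real API)."""
    return word in FREQUENT


def compound_fallback(word: str) -> bool:
    """Stand-in for a compounding-aware checker: naive two-part split."""
    return any(word[:i] in FREQUENT and word[i:] in FREQUENT
               for i in range(1, len(word)))


def make_checker(*acceptors: Callable[[str], bool]) -> Callable[[str], bool]:
    """Accept a word as soon as any acceptor, tried in order, accepts it."""
    def check(word: str) -> bool:
        return any(accept(word) for accept in acceptors)
    return check


check = make_checker(primary, compound_fallback)
```

Because `any` short-circuits, the slower fallback runs only for words the primary lookup rejects.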
-
One strategy for checking words is compound formation. This is often a reasonable approximation, but sometimes I want to perform a strict check that should help me find incorrect words like thi…
nkrot updated
7 years ago
-
In processing the English Lexicon, I realized there are two separate words for _sibling_ based on whether the sibling is _elder_ or _younger_ than the point of reference -- however, there is no such d…
-
I want to prevent hyphenation/line breaking of words containing hyphens ([hyphenated compounds](https://en.wiktionary.org/wiki/hyphenated_compound)). Some common examples of hyphenated compounds i…
hftf updated
2 months ago
-
This rule would fail when the constituent parts of closed-form compound words are capitalised.
### Fail
```js
function unSubscribe() {}
let passWord;
let isInViewPort;
```
### Pass
```j…
-
This bug might be the root cause of issue #97
![screenshot from 2016-06-22 09-38-35](https://cloud.githubusercontent.com/assets/8461400/16266821/1bf64a6c-385d-11e6-9d77-59e59bf6de96.png)
-
Hello, thank you for sharing your code. I want to replicate your experiments on the Llama model, mainly on the compound_words dataset. However, the data you provided should be based …
xyzCS updated
3 months ago
-
Hello everyone,
I am trying to replicate the code provided for counting the frequency of co-occurring words.
```
# find frequently co-occurring words (typically compound words)
ngram2 % dfm()
n…