-
```
There was a nice paper at NAACL 2009 about sentence boundary detection that
should be straightforward to implement for ClearTK. See:
http://www.icsi.berkeley.edu/pubs/speech/sbd_naacl_2009.pdf
`…
-
To be fair to the excellent developers at [spaCy](http://spacy.io), you might differentiate between our implementation of their return objects (which come from spaCy in Python lists) and our R objects,…
-
The preprocessing steps seem to work for all of the WSJ data, but I'm running into some issues with the Brown test set. It might be a version issue with my Penn Treebank data and/or Stanford parser, …
-
Is there any way to remove the limitation that `inputs.txt` must be ASCII text? I would love to be able to train a network that can produce plausible binary files of various types. Right now it dies h…
moyix updated
9 years ago
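One way to think about the request above is to treat every byte value as a symbol, so the vocabulary is fixed at 256 entries regardless of file type. The following is only a minimal sketch of that idea, not the project's actual loader; `bytes_to_ids`, `ids_to_bytes`, and `VOCAB_SIZE` are hypothetical names introduced here for illustration.

```python
from typing import List

# Assumption: one symbol per possible byte value (0..255).
VOCAB_SIZE = 256


def bytes_to_ids(data: bytes) -> List[int]:
    """Map raw bytes to integer ids; iterating over bytes yields ints in 0..255."""
    return list(data)


def ids_to_bytes(ids: List[int]) -> bytes:
    """Turn sampled ids back into raw bytes; raises ValueError if an id is outside 0..255."""
    return bytes(ids)


# Usage: read the training file in binary mode instead of text mode, e.g.
#   with open(path, "rb") as f:      # `path` is a placeholder
#       ids = bytes_to_ids(f.read())
```

Because the ids are plain integers, non-ASCII and non-printable bytes round-trip unchanged.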
-
Quick summary of current features; will add on.
1. (Implemented) Core features:
- n-gram of POS consisting of n-1 words on the stack and the first word on the buffer
2. Calculating an uneasiness fa…
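The POS n-gram feature in item 1 could be sketched as follows. This is one hedged reading of the description: it assumes "n-1 words on the stack" means the n-1 topmost stack entries, listed bottom to top, and it pads with a placeholder tag when the stack or buffer is too short. The function name `pos_ngram_feature` and the `PAD` tag are hypothetical and not taken from the project.

```python
from typing import Sequence, Tuple

# Assumption: placeholder tag used when the stack or buffer runs out.
PAD = "<PAD>"


def pos_ngram_feature(
    stack_tags: Sequence[str], buffer_tags: Sequence[str], n: int
) -> Tuple[str, ...]:
    """Build a POS n-gram from the top n-1 stack tags plus the first buffer tag."""
    if n < 1:
        raise ValueError("n must be at least 1")
    k = n - 1
    # Guard k == 0 explicitly: stack_tags[-0:] would return the whole stack.
    top = list(stack_tags[-k:]) if k > 0 else []
    # Left-pad so the feature always has exactly n tags.
    top = [PAD] * (k - len(top)) + top
    first = buffer_tags[0] if buffer_tags else PAD
    return tuple(top + [first])


# Example: stack tags DT JJ NN, buffer tags VBZ DT, trigram feature.
feature = pos_ngram_feature(["DT", "JJ", "NN"], ["VBZ", "DT"], 3)
```

Returning a tuple keeps the feature hashable, so it can be used directly as a dictionary key or joined into a string feature name.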
-
Dear LINDAT people,
The following query, submitted by one of my students, leads to a timeout when the timeout is set to the default value of 30. It works fine (but slowly) when the timeout is set to …
-
(gh_deepspeed) ub2004@ub2004-B85M-A0:~/llm_dev/DeepSpeedExamples/training/data_efficiency/gpt_finetuning$ python -m torch.distributed.launch --nproc_per_node=1 --master_port 12346 run_clm_no_t…
-
For example, I have this sentence:
'The first approach, single-molecule simulation, taken by the StochSim simulator, tracks individual molecules and their state (e.g., what other molecules they are bound to) so t…
-
It'd be nice to have a logo for the Anthology. A variation on the ACL block "a", but distinguishable. One idea is to incorporate that "a" with the coarse shape of a two-column paper.