-
As far as I could see, the Swift CLI implementation currently doesn't support negative prompts.
In the Python implementation, the negative prompt argument is tokenised, encoded into `uncond_embeddi…
-
If I try to use the attached script to analyse the attached tokenised news article from hs.fi, the lookup gets stuck at:
{u'POS': [u'ADVERB'], u'WORD_ID': [u'ennen']}
{u'CASE': [u'NOM', u'PAR'], u'GUESS'…
-
As of right now, the BigBird model (loaded using `AutoModelForTokenClassification`) takes in inputs encoded with `AutoTokenizer`. When the model is trained, e.g. `model(**inputs, labels=labels)` th…
-
I have dug a bit deeper into the code and noted some issues which, in my opinion, are due to the fact that the input string is not tokenised beforehand. I have started to write one, which hand…
-
Add the ability to pass an HF dataset with `streaming=True` and run it inside the training pipeline so we can run on very large datasets. Also understand the slowdown of using streaming over `load_from_dis…
-
There are many ways to calculate the length of some text. Bytes, characters, runes, words, lines...
I think the maxbodylength rule operates on lines, but the current warning string format does not …
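
To illustrate why the unit matters, here is a minimal Python sketch that measures the same text in several ways. `text_lengths` is a hypothetical helper written for this example and is not part of any linter; "code points" stands in for runes, and user-perceived characters (grapheme clusters) would need a third-party library.

```python
def text_lengths(text: str) -> dict:
    """Measure one piece of text in several different units (illustrative only)."""
    return {
        "bytes": len(text.encode("utf-8")),  # size of the UTF-8 encoding
        "code_points": len(text),            # Unicode code points ("runes")
        "words": len(text.split()),          # whitespace-separated tokens
        "lines": len(text.splitlines()),     # number of lines, blank ones included
    }


# Hypothetical commit body used only for demonstration.
body = "Fix naïve parser\n\nHandle empty input."
for unit, value in text_lengths(body).items():
    print(f"{unit}: {value}")
```

Each unit gives a different number for the same body, so a warning that only says "length" leaves the reader guessing which one was exceeded.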
-
SWIG does not recognize the digit separator (`'`) in integer literals ([C++14 feature](http://en.cppreference.com/w/cpp/language/integer_literal)). Related issue: #1030.
Example of the interface:
```…
-
Fix unused imports
- [ ] Bag
```
src/SimplIR/Bag.hs:22:1: warning: [-Wunused-imports]
The import of ‘Data.Semigroup’ is redundant
except perhaps to import instances from ‘Data.Semigr…
-
When parsing large documents with tables placed in arbitrary locations on a page, I wonder if it would be useful to help Tabula get its eye in as to the location of a table by giving it one or more keyw…
-
### Summary
Context driven components
### Description
As an admin user, I want to use the preview tools to see how my campaign will look to different types of users at different stages of the…