-
## 🚀 Feature
The current design in torchtext presents the user with two APIs for dataset construction:
- the "raw" API, which returns the raw text data from the dataset, and
- the one-liner build…
-
There is at least one unusable vocabulary entry in our gabert vocab, namely `##-"`. Find all entries that BERT will never use, since BERT first splits around all non-alphanumeric characters without ap…
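A rough sketch of how such entries could be found, assuming the simplified rule stated above: every non-alphanumeric character is split off as its own one-character word before WordPiece runs. The helper names (`is_unreachable`, `find_unreachable`) and the handling of bracketed special tokens are assumptions for illustration, not part of any existing tooling, and `str.isalnum` is only an approximation of BERT's real punctuation test.

```python
import re

# Bracketed special tokens such as [CLS] or [unused12]; assumed here to be
# inserted directly rather than produced by tokenization.
SPECIAL = re.compile(r"^\[[^\[\]]+\]$")


def is_unreachable(entry: str) -> bool:
    """Return True if `entry` can never be emitted under the simplified rule."""
    if SPECIAL.match(entry):
        return False
    is_continuation = entry.startswith("##") and len(entry) > 2
    piece = entry[2:] if is_continuation else entry
    has_symbol = any(not ch.isalnum() for ch in piece)
    # A symbol always ends up as its own one-character word, so it can appear
    # neither inside a continuation piece nor inside a longer entry.
    return has_symbol and (is_continuation or len(piece) > 1)


def find_unreachable(vocab):
    """Filter a list of vocabulary entries down to the unreachable ones."""
    return [entry for entry in vocab if is_unreachable(entry)]


if __name__ == "__main__":
    sample = ['##-"', "hello", "##ing", "!", "[CLS]", "don't"]
    print(find_unreachable(sample))
```

Running this over the real vocab file (one entry per line) would give a candidate list to review by hand; entries made of combining marks or other characters that `isalnum` rejects but BERT does not split on may show up as false positives.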
-
If I want to recognize a single picture, how can I do it? Do I still need to generate the LMDB format? Could you provide a predict interface? Thank you.
-
Hello, I loaded the pre-trained llava-llama3 SFT weights and fine-tuned them with LoRA, but I get an error when merging the weights:
**scripts:**
Training:
```
deepspeed --master_port=$((RANDOM + 10000)) --inclu…
-
Please Indicate One:
* [x] Editorial
* [ ] Question
* [ ] Feedback
* [ ] Blocking Issue
* [ ] Non-Blocking Issue
Please Describe the Issue:
* via https://github.com/gobengo/activitystre…
-
When people want to categorize their content, they don't think to themselves, "I want to add a system of taxonomy to my website", they think "how do I add categories?". We need to make our user-inter…
-
Data is tokenized twice:
1. With Stanford CoreNLP: https://github.com/nlpyang/PreSumm/blob/ba17e95de8cde9d5ddaeeba01df7cace584511b2/src/prepro/data_builder.py#L110
2. With HuggingFace's Bert…
-
This might initially seem unrelated to the issue title, but please bear with me...
There is this slight WTF (or feature, depending on how one looks at this) with visibility conditions not being avail…
-
What features should CLAW have in order to be considered usable in terms of metadata creation and manipulation? List your features below.
-
@anuchandy @JonathanGiles Any suggestions would be helpful. :)
I've been trying to use Azure App Config with our new ServiceClient, and I've had to do a bit of weirdness because ServiceClient takes…