-
Hi, I am trying to serve the GPT-J 6B model on a TPUv3-8, for which I am using the saxml framework.
The error occurs when I am converting the model from PyTorch to Pax format, which i…
-
-
in lora paper section 3:
Adapter Layers Introduce Inference Latency: There are many variants of adapters. We focus
on the original design by Houlsby et al. (2019), which has two adapter layers per …
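The latency point quoted above can be illustrated numerically: a LoRA update can be folded into the frozen weight before serving, whereas an adapter is an extra sequential module that cannot be folded away. A minimal sketch with hypothetical dimensions (not taken from the paper's experiments):

```python
import numpy as np

# LoRA Sec. 3 intuition: the low-rank update B @ A can be merged into the
# frozen base weight W once, so inference pays for a single matmul and adds
# zero extra latency. Dimensions below are illustrative only.
rng = np.random.default_rng(0)
d, r = 8, 2                      # model width 8, LoRA rank 2 (hypothetical)
W = rng.standard_normal((d, d))  # frozen base weight
A = rng.standard_normal((r, d))  # LoRA down-projection
B = rng.standard_normal((d, r))  # LoRA up-projection

x = rng.standard_normal(d)
y_unmerged = W @ x + B @ (A @ x)  # training-time form: extra matmuls per step
W_merged = W + B @ A              # fold the update into W once, offline
y_merged = W_merged @ x           # serving form: one matmul, no overhead

assert np.allclose(y_unmerged, y_merged)
```

The merge is exact by distributivity, which is precisely why LoRA avoids the inference penalty the paper attributes to adapter layers.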
-
In HF transformers, the default QLoRA settings do not replicate the QLoRA of the original paper, leaving valuable performance on the table for ML practitioners who rely on library defaults.
One has to…
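For reference, a sketch of the 4-bit settings the QLoRA paper uses (NF4 quantization, double quantization, bf16 compute), which in HF transformers must be requested explicitly rather than relied on as defaults. The dict below is illustrative, not the library's own defaults object:

```python
# Illustrative sketch (not library defaults): the paper's 4-bit settings,
# which one would pass explicitly, e.g. to transformers.BitsAndBytesConfig.
qlora_kwargs = {
    "load_in_4bit": True,                  # quantize base weights to 4 bit
    "bnb_4bit_quant_type": "nf4",          # NormalFloat4, the paper's choice
    "bnb_4bit_use_double_quant": True,     # also quantize the quant constants
    "bnb_4bit_compute_dtype": "bfloat16",  # compute dtype for the forward pass
}
assert qlora_kwargs["bnb_4bit_quant_type"] == "nf4"
```

Whether each of these matches the paper's exact recipe for a given model still needs checking against the paper itself; the point is only that none of them comes for free from the defaults.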
-
Clarification is needed on how to adapt the returned SOAP fault for different
kinds of error conditions and error codes.
There is a SOAP-fault transformer generated by default, declared:
a) in…
-
Hey, thanks for the great implementation!
I have two questions for you.
1) Since you have implemented slot attention, could you say what differences you find, technically, between the slot …
-
Hi @m-bain,
This is a very cool repository and definitely useful for getting more reliable and accurate timestamps for the generated transcriptions.
I was wondering if you'd like to extend the cur…
-
This has been bugging me for a while, but can you include the raw JSON data in the returned pydantic model? Sometimes the assumption used to parse the JSON isn't correct, and I would have to h…
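As an illustration of the request (stdlib only, all names hypothetical, not this library's API), a parser can stash the decoded JSON next to the typed fields so callers can recover anything the parsing assumptions dropped:

```python
import json
from dataclasses import dataclass, field
from typing import Any

# Hypothetical sketch: keep the raw decoded JSON alongside the parsed fields,
# so unexpected keys are still inspectable even if parsing assumptions fail.
@dataclass
class Parsed:
    name: str
    raw: dict[str, Any] = field(default_factory=dict)

def parse(payload: str) -> Parsed:
    data = json.loads(payload)
    return Parsed(name=data.get("name", ""), raw=data)

p = parse('{"name": "x", "unexpected": 1}')
assert p.name == "x"
assert p.raw["unexpected"] == 1  # the raw payload survives parsing
```

With pydantic the same effect could presumably be achieved in a `mode="before"` validator that copies the incoming dict into a `raw` field before field extraction.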
-
In the original [SanitizationFilter](https://github.com/gjtorikian/html-pipeline/blob/v2.14.3/lib/html/pipeline/sanitization_filter.rb), which uses `Sanitize`, you had two transformers
```ruby
…
```
-
### Before submitting a bug, please complete the following checks.
- [X] You have finished loading all model packs before login to world/server.
- [X] You're using the latest stable, snapshot, or dai…