-
This is a small initial list (feel free to suggest additions in the comments) of models that, IMHO, should at least be present:
- [x] stable-diffusion
- [ ] whisper
- [ ] wizardLM (no links, only configura…
-
Hi, thanks for making this project public.
I am trying to run training with fp16 and get the following error:
>RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTen…
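For context, this error usually means the inputs were cast to fp16 while the model weights stayed in fp32 (or vice versa). A minimal sketch of the two usual fixes, assuming plain PyTorch; the toy `nn.Linear` stands in for the real model:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                      # weights default to float32
x = torch.randn(1, 4, dtype=torch.float16)   # half-precision input

# model(x) here would raise the "Input type ... and weight type" RuntimeError.

# Fix 1: cast the input to the weights' dtype.
out = model(x.to(next(model.parameters()).dtype))

# Fix 2: run the forward pass under autocast, which inserts casts automatically
# (use device_type="cuda" with dtype=torch.float16 on GPU).
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out2 = model(torch.randn(1, 4))
```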
-
### Describe the bug
Regardless of how many cores I use (I have 16 or 32 threads), `map` slows to a crawl at around 80% done, lingers extremely slowly until maybe 97%, and NEVER finishes the job. I…
-
Do you have plans to release the trained models in ONNX format?
I tried to manually export the model `mpt-1b-redpajama-200b` to ONNX. The export generated lots of warnings like this:
```
Trac…
-
Hey,
thank you in advance for your great work and sharing the data :)
I read the README and the Hugging Face details, and it was unclear to me whether fuzzy deduplication was actually done on this dataset.
I underst…
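For reference, fuzzy deduplication typically means near-duplicate detection, e.g. Jaccard similarity over character or word shingles (approximated with MinHash at scale). A toy illustration in pure Python, not this dataset's actual pipeline:

```python
def shingles(text: str, n: int = 3) -> set:
    """Character n-grams of a whitespace-normalized, lowercased string."""
    text = " ".join(text.lower().split())
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two strings' shingle sets."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)

# Near-duplicates score high even when they are not byte-identical,
# which is exactly what exact-match dedup misses.
hi = jaccard("The quick brown fox", "The quick brown fox!")  # high
lo = jaccard("The quick brown fox", "Lorem ipsum dolor")     # low
```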
-
I have installed the llm-rs library with the CUDA version. However, even though I have set `use_gpu=True` in the `SessionConfig`, the GPU is not utilized when running the code. Instead, the CPU usage …
-
Thanks for your awesome work! I noticed that there are optimized domain weights called `configs/pile_doremi_r1_120M_ref:pile_baseline_50kvocab_nopack_120M.json`, as shown in the README. Can we consider this doma…
-
I'm trying to follow the instructions in https://github.com/mlfoundations/open_flamingo/issues/228 in order to run inference on a GPU.
My code looks as follows:
```
model, image_processor, tokeni…
-
**Description:**
Add MPT with Gradient Checkpointing and LoRA support into the OpenThaiGPT pretraining code. We will use MPT with LoRA for continued pretraining in task #179
**To Do:**
1. MPT Weight + MP…
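As background, LoRA freezes the pretrained weights and trains only a low-rank additive update. A minimal sketch of the idea in plain PyTorch, illustrative only and not the actual OpenThaiGPT/MPT integration:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (B @ A)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # freeze pretrained weights
            p.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # base output + scaled low-rank correction
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(16, 16))
y = layer(torch.randn(2, 16))
# lora_b starts at zero, so the layer initially matches the frozen base layer.
```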
-
1. How do we get the mean value of the massive activations, e.g. 2546.8/-1502.0 in hook.py?
2. The mean value is still large; what is the difference between using the mean value and using the original value?
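Regarding question 1, activation means like these are typically read out with a forward hook. A minimal sketch in plain PyTorch (illustrative only; the cited values come from the repo's own hook.py, not this code):

```python
import torch
import torch.nn as nn

stats = {}

def record_mean(name):
    def hook(module, inputs, output):
        # record the mean of this module's output activations per forward pass
        stats.setdefault(name, []).append(output.detach().mean().item())
    return hook

model = nn.Sequential(nn.Linear(8, 8), nn.ReLU())
model[0].register_forward_hook(record_mean("linear0"))

with torch.no_grad():
    for _ in range(4):
        model(torch.randn(2, 8))

# average over all recorded forward passes
mean_activation = sum(stats["linear0"]) / len(stats["linear0"])
```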