Open lxe opened 1 year ago
I get 2 parts of 13 GB each, while the original 7B model is 13 GB in total.
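One possible explanation, not confirmed anywhere in this thread, is that the original weights are stored as 16-bit floats while the new checkpoint was saved as 32-bit floats, which would double the size. A rough sketch of the arithmetic follows; the parameter count of 6.5e9 is an assumption inferred from 13 GB at 2 bytes per parameter, and `checkpoint_size_gb` is a hypothetical helper, not part of any library.

```python
# Rough checkpoint size estimate: parameter count times bytes per parameter.
# File headers and sharding overhead are ignored.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def checkpoint_size_gb(num_params: float, dtype: str) -> float:
    """Approximate checkpoint size in GB (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: ~6.5e9 parameters, inferred from "13 GB in total" at 2 bytes each.
NUM_PARAMS = 6.5e9

fp16_gb = checkpoint_size_gb(NUM_PARAMS, "float16")
fp32_gb = checkpoint_size_gb(NUM_PARAMS, "float32")

print(f"float16: {fp16_gb:.1f} GB")  # matches the original 13 GB
print(f"float32: {fp32_gb:.1f} GB")  # matches 2 parts of 13 GB each
```

If that is the cause, converting the weights to half precision before saving should bring the total back to about 13 GB.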
Can you share the training config (e.g., batch_size, max_seq_len, ...)? How many resources (VRAM, ...) are needed for training?