-
Hi! I am training a language model similar to the one in the Sparse Text Generation project, with a custom input format. When I start training, it cannot calculate the entmax loss.
My inputs and labels both have…
-
## Summary
This RFC proposes the semi-implicit two-phase solver in branch `solver/two-phase-semiimplicit`. The solver uses a semi-implicit Chorin projection scheme, which is mainly to be us…
-
Submit your final project ideas here.
-
https://arxiv.org/abs/2211.12905
-
Hi all, I'm trying to run inference with the galactica-6.7B model, but errors start popping up after a few examples, and I'm not sure what to do. Can anyone take a look and tell me what is going wrong?
followin…
-
@ZacharyWills @KatherinePowell-NOAA
We received requirements from CO-OPS for a development space. I would like your advice on setting up a space or possibly even a separate head…
-
### System Info
CPU Architecture: x86_64
CPU/Host memory size: 1024Gi (1.0Ti)
GPU properties:
GPU name: NVIDIA GeForce RTX 4090
GPU mem size: 24Gb…
-
Hello,
Thanks for the great tool. I'm excited to use ProtTrans for generating protein sequences, but I'm getting an index error in the example notebook (https://github.com/agemagician/ProtTrans/blo…
-
The sh parameters are as follows:
python3 train_qlora.py \
--train_args_json chatGLM_6B_QLoRA.json \
--model_name_or_path /mnt/disk_data/soft/text-generation-webui-main/models/ChatGLM2-6B \
--train_data_path data/train.js…
-
Hello,
I'm looking for a way to visualize the attention maps of the pre-trained models, but I haven't found a solution yet. Has anyone already done this successfully?
Thank you!