Open tijszwinkels opened 11 months ago
Hi, thanks for reporting this! Can you try running some PyTorch code that is independent of Petals in your environment? For instance, any example from the transformers library: https://github.com/huggingface/transformers/tree/main/examples/pytorch
I opted for the multiple-choice example; it runs without issue.
torch_dtype=torch.float32
I'm facing the same error when running the following code:
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM
model_name = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
INITIAL_PEERS = [
"/ip4/192.168.100.250/tcp/31337/p2p/QmdCqmPMqgxFHqmMbbUxuU8Hm5KwoY9zRj5s5DbiyJoPbK",
]
model = AutoDistributedModelForCausalLM.from_pretrained(model_name, initial_peers=INITIAL_PEERS)
It fails on the last line with the following error:
Aug 03 22:22:04.246 [INFO] Make sure you follow the LLaMA's terms of use: https://bit.ly/llama2-license for LLaMA 2, https://bit.ly/llama-license for LLaMA 1
Aug 03 22:22:04.246 [INFO] Using DHT prefix: Llama-2-7b-chat-hf
Floating point exception (core dumped)
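A floating point exception kills the interpreter before Python can print a traceback, so the log above does not show which call triggered it. One possible way to narrow this down is to run the failing code in a child interpreter with the standard-library `faulthandler` enabled, which dumps the Python stack on fatal signals such as SIGFPE. The following is a minimal sketch, not something from the Petals docs: `run_isolated` and `describe` are hypothetical helper names, and the snippet passed in the example is a stand-in that raises SIGFPE on itself (POSIX only). In practice you would pass the failing `from_pretrained` lines instead.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 600.0):
    """Run `snippet` in a child interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX, a negative returncode -N
    means the child was killed by signal N.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def describe(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable message."""
    if returncode < 0:
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with code {returncode}"


if __name__ == "__main__":
    # Stand-in for the crashing code: the child sends SIGFPE to itself.
    # Replace this string with the real reproduction script.
    stand_in = "import os, signal; os.kill(os.getpid(), signal.SIGFPE)"
    code, stderr = run_isolated(stand_in)
    print(describe(code))
    print(stderr)  # faulthandler's stack dump, if any, ends up here
```

If the crash originates inside a native extension, the dumped Python stack should at least show which Python-level call led into it. The same effect can be had without a wrapper by running the script with `python -X faulthandler script.py`.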
CPU details (lscpu):
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 44 bits physical, 48 bits virtual
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 64
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 47
Model name: Intel(R) Xeon(R) CPU E7- 4870 @ 2.40GHz
Stepping: 2
CPU MHz: 2394.006
BogoMIPS: 4788.01
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 2 MiB
L1i cache: 2 MiB
L2 cache: 16 MiB
L3 cache: 1.9 GiB
NUMA node0 CPU(s): 0-63
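The `lscpu` output above does not include the Flags line, so it does not show which instruction-set extensions the virtualized CPU exposes. For comparing affected machines, a small sketch like the one below could list a few of them. It assumes the Linux `/proc/cpuinfo` text format. `cpu_flags` is a hypothetical helper, and the flag names checked are only examples. Nothing in this thread establishes that any particular flag is related to the crash.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flag names from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; empty result elsewhere
    text = path.read_text() if path.exists() else ""
    flags = cpu_flags(text)
    # Example flags only; adjust to whatever you want to compare.
    for name in ("sse4_2", "avx", "avx2", "fma", "f16c"):
        print(f"{name}: {'yes' if name in flags else 'no'}")
```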
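I ran into the same problem. After some investigation, it looks like the core dump is caused here; as a temporary workaround, you can comment out this piece of code.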
I ran into this when trying to run: https://github.com/petals-infra/chat.petals.dev
But I believe this is an issue with the petals library itself. The following minimal example crashes as well:
Running it:
It crashes on the last line. Please note it also crashes without the torch_dtype specification.
These are the capabilities of the virtualized CPU I'm running on: