pytorch / cpuinfo

CPU INFOrmation library (x86/x86-64/ARM/ARM64, Linux/Windows/Android/macOS/iOS)
BSD 2-Clause "Simplified" License

failed to get cpuinfo on aws lambda arm64 #143

Open kartheekgottipati opened 1 year ago

kartheekgottipati commented 1 year ago

AWS Lambda Arm64 pytorch 2.0.0

When running PyTorch 2.0.0 on AWS Lambda (arm64), I get the following error:

[WARNING] 2023-04-10T23:55:34.026Z RUNNING WITH 1 threads
Error in cpuinfo: failed to parse the list of possible processors in /sys/devices/system/cpu/possible
Error in cpuinfo: failed to parse the list of present processors in /sys/devices/system/cpu/present
Error in cpuinfo: failed to parse both lists of possible and present processors
terminate called after throwing an instance of 'c10::Error'
what(): [enforce fail at ThreadPool.cc:44] cpuinfo_initialize(). cpuinfo initialization failed
frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::string const&, void const*) + 0x50 (0xffff70e7ca90 in /var/task/torch/lib/libc10.so)
frame #1: c10::ThrowEnforceNotMet(char const*, int, char const*, char const*, void const*) + 0x50 (0xffff70e7cc30 in /var/task/torch/lib/libc10.so)
frame #2: + 0x2c8cc78 (0xffff73b6ac78 in /var/task/torch/lib/libtorch_cpu.so)
frame #3: + 0x2c8fb64 (0xffff73b6db64 in /var/task/torch/lib/libtorch_cpu.so)
frame #4: at::set_num_threads(int) + 0x2c (0xffff71bc12bc in /var/task/torch/lib/libtorch_cpu.so)
frame #5: + 0x58d698 (0xffff7980f698 in /var/task/torch/lib/libtorch_python.so)

frame #63: __libc_start_main + 0xe8 (0xffff84323e18 in /lib/aarch64-linux-gnu/libc.so.6)
START RequestId: 6b21fcf4-19b2-45cc-83e4-74a2cefe6bad Version: $LATEST
RequestId: 6b21fcf4-19b2-45cc-83e4-74a2cefe6bad Error: Runtime exited with error: signal: aborted
Runtime.ExitError

Neither x86_64 nor arm64 has access to these files on AWS Lambda, but x86_64 ignores the problem and proceeds, while arm64 fails with the error above. Is there a reason this is logged as an error on arm64 but only as a warning elsewhere?

subhankar-trisetra commented 1 year ago

I'm having the same issue

jc-hdez commented 10 months ago

I am having the same issue, torch version 2.1.0

thecasual commented 7 months ago

any update?

stephenswetonic commented 4 months ago

I believe the issue is with onnxruntime itself and is still not resolved. I'm going to try x86 for now.

malfet commented 4 months ago

In some sense this is expected behavior: the Lambda runtime doesn't want to leak hardware details to hosted processes, so cpuinfo fails to initialize. The PyTorch crash should be fixed, though.

StanislavMakhrov commented 4 months ago

Is there any SLA for resolving this bug? The issue was opened more than a year ago. @soumith, @apaszke, @suo, could you please help?

pluiedev commented 2 months ago

Also problematic in restricted build environments (like Nix) that don't expose /sys/devices/system/cpu/{possible,present} to prevent packages from relying on the specific hardware configuration of the build system.
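
For anyone who wants to confirm whether their environment hides these files, here is a minimal diagnostic sketch. It assumes the files, when present, use the usual kernel CPU-list format (for example `0-3,5`). The helper names `parse_cpu_list` and `read_cpu_list` are hypothetical and are not part of cpuinfo or PyTorch; this only mimics the kind of check that fails in the log above and is not how cpuinfo itself is implemented.

```python
from pathlib import Path
from typing import List, Optional

# The two sysfs files named in the cpuinfo error messages.
CPU_LIST_FILES = (
    "/sys/devices/system/cpu/possible",
    "/sys/devices/system/cpu/present",
)


def parse_cpu_list(text: str) -> List[int]:
    """Parse a kernel CPU list such as '0-3,5' into [0, 1, 2, 3, 5]."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_cpu_list(path: str) -> Optional[List[int]]:
    """Return the parsed list, or None if the file is missing, unreadable, or malformed."""
    try:
        return parse_cpu_list(Path(path).read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    for path in CPU_LIST_FILES:
        print(path, "->", read_cpu_list(path))
```

If both paths print `None`, the environment is not exposing the processor lists to the process, which matches the "failed to parse both lists" message.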

nywhere commented 2 minutes ago

Any update?