Closed cyyever closed 4 months ago
@yurivict My result is
python3 a.py
99 1283.42333984375
199 853.2401123046875
299 568.3171997070312
399 379.5828552246094
499 254.54953002929688
599 171.7064666748047
699 116.80989074707031
799 80.42716217041016
899 56.31061553955078
999 40.32223129272461
1099 29.720619201660156
1199 22.68964195251465
1299 18.02572250366211
1399 14.931407928466797
1499 12.87797737121582
1599 11.514902114868164
1699 10.609926223754883
1799 10.008904457092285
1899 9.609664916992188
1999 9.344354629516602
Result: y = 0.0073340958915650845 + 0.8354617357254028 x + -0.0012652541045099497 x^2 + -0.09030362218618393 x^3
Can you try to reproduce this on the main branch of PyTorch with cpuinfo replaced by my version? Some modifications to main are required, so just try my fork: https://github.com/cyyever/pytorch/tree/freebsd. There are no floating-point computations in this PR, just system calls. The error may be due to corrupted PyTorch code; you can recompile everything from source and retry.
I tried with the recent release 2.0.1
Recompiling with PyTorch 2.0.1 should work. Replace the cpuinfo in torch/third_party, make sure PyTorch links against my version, and retry. Please also debug your code with valgrind and paste the result here. It is also possible that some bug in release 2.0.1 was fixed recently.
Here is what cgdb screen looks like at the moment of failure:
79│ packages[i].processor_start = i * threads_per_package;
80│ packages[i].processor_count = threads_per_package;
81│ packages[i].core_start = i * cores_per_package;
82│ packages[i].core_count = cores_per_package;
83│ packages[i].cluster_start = i;
84│ packages[i].cluster_count = 1;
85│ cpuinfo_x86_format_package_name(x86_processor.vendor, brand_string, packages[i].name);
86│ }
87│ for (uint32_t i = 0; i < freebsd_topology.cores; i++) {
88│ cores[i] = (struct cpuinfo_core) {
89│ .processor_start = i * threads_per_core,
90│ .processor_count = threads_per_core,
91│ .core_id = i % cores_per_package,
92├───────────────────────> .cluster = clusters + i / cores_per_package,
93│ .package = packages + i / cores_per_package,
94│ .vendor = x86_processor.vendor,
95│ .uarch = x86_processor.uarch,
96│ .cpuid = x86_processor.cpuid,
97│ };
98│ }
99│ for (uint32_t i = 0; i < freebsd_topology.threads; i++) {
100│ const uint32_t smt_id = i % threads_per_core;
101│ const uint32_t core_id = i / threads_per_core;
102│ const uint32_t package_id = i / threads_per_package;
103│
104│ /* Reconstruct APIC IDs from topology components */
/disk-samsung/pytorch-work/pytorch-v2.0.1/third_party/cpuinfo/src/x86/freebsd/init.c
New UI allocated
(gdb) r x2.py
Starting program: /usr/local/bin/python3.9 x2.py
warning: File "/usr/local/lib/libpython3.9.so.1.0-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /usr/local/lib/libpython3.9.so.1.0-gdb.py
line to your configuration file "/home/yuri/.config/gdb/gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/home/yuri/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
Program received signal SIGFPE, Arithmetic exception.
Integer divide by zero.
0x0000000808f87fa4 in cpuinfo_x86_freebsd_init () at /disk-samsung/pytorch-work/pytorch-v2.0.1/third_party/cpuinfo/src/x86/freebsd/init.c:92
92 .cluster = clusters + i / cores_per_package,
(gdb) p cores_per_package
$1 = 0
(gdb) p &cores_per_package
Address requested for identifier "cores_per_package" which is in register $r12
(gdb) p clusters
$2 = (struct cpuinfo_cluster *) 0x853e416c0
(gdb) p i
$4 = 0
(gdb) p cores_per_package
$5 = 0
(gdb)
cores_per_package is zero, which is wrong. Unfortunately, freebsd_topology isn't printed by the debugger because it is somehow "optimized out".
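The values printed by gdb are enough to reproduce the trap in isolation. The sketch below is Python rather than the C of init.c, and `cluster_offset` is a hypothetical stand-in for the expression on line 92. It only illustrates that integer division by zero fails even when the numerator `i` is also zero.

```python
def cluster_offset(i: int, cores_per_package: int) -> int:
    """Hypothetical stand-in for `i / cores_per_package` on init.c line 92."""
    # `//` truncates like C's unsigned integer division for non-negative values.
    return i // cores_per_package


# Values read from the gdb session above.
i = 0
cores_per_package = 0

try:
    cluster_offset(i, cores_per_package)
except ZeroDivisionError:
    # The C build has no exception to raise; the CPU traps and the process gets SIGFPE.
    print("integer divide by zero")
```

Python reports the problem as an exception, whereas the compiled code receives SIGFPE, which is why the failure shows up as a "Floating point exception" even though no floating-point math is involved.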
@yurivict I could reproduce the error on another host, and it is now fixed.
Same Floating point exception with patches cb647773be54f308a5836ebd65a1c5ec2bea46c2 and 7948b28fd62277db51758fc1151efcc562851ea6.
Please print the output of sysctl kern.sched.topology_spec
$ sysctl kern.sched.topology_spec
kern.sched.topology_spec: <groups>
<group level="1" cache-level="3">
<cpu count="8" mask="ff,0,0,0">0, 1, 2, 3, 4, 5, 6, 7</cpu>
<children>
<group level="2" cache-level="2">
<cpu count="2" mask="3,0,0,0">0, 1</cpu>
<flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
</group>
<group level="2" cache-level="2">
<cpu count="2" mask="c,0,0,0">2, 3</cpu>
<flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
</group>
<group level="2" cache-level="2">
<cpu count="2" mask="30,0,0,0">4, 5</cpu>
<flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
</group>
<group level="2" cache-level="2">
<cpu count="2" mask="c0,0,0,0">6, 7</cpu>
<flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
</group>
</children>
</group>
</groups>
@yurivict The number of packages should be detected as 1. Can you make a debug build and give me the backtrace from valgrind or lldb?
"packages" comes from the sysctl value "kern.smp.cpus", which has the value 8. "cores" comes from the sysctl value "kern.smp.cores", which has the value 4. So cores_per_package = freebsd_topology.cores / freebsd_topology.packages = 4 / 8, which is zero in integer division.
The latest code counts "packages" from "kern.sched.topology_spec", which should give 1 on your host.
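As a rough illustration of counting packages from the topology spec, the sketch below parses the XML posted earlier in this thread. It is Python, not the C used in cpuinfo, and it rests on two assumptions that the thread does not confirm: that each top-level group is one package, and that each group flagged SMT is one physical core. The function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Value of kern.sched.topology_spec as posted in this thread.
TOPOLOGY_SPEC = """<groups>
 <group level="1" cache-level="3">
  <cpu count="8" mask="ff,0,0,0">0, 1, 2, 3, 4, 5, 6, 7</cpu>
  <children>
   <group level="2" cache-level="2">
    <cpu count="2" mask="3,0,0,0">0, 1</cpu>
    <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
   </group>
   <group level="2" cache-level="2">
    <cpu count="2" mask="c,0,0,0">2, 3</cpu>
    <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
   </group>
   <group level="2" cache-level="2">
    <cpu count="2" mask="30,0,0,0">4, 5</cpu>
    <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
   </group>
   <group level="2" cache-level="2">
    <cpu count="2" mask="c0,0,0,0">6, 7</cpu>
    <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
   </group>
  </children>
 </group>
</groups>"""


def summarize(spec: str) -> dict:
    """Count packages, cores and threads under the assumptions stated above."""
    root = ET.fromstring(spec)
    # Assumption: direct children of <groups> are packages.
    top_level = root.findall("group")
    # Assumption: every group carrying an SMT flag is one physical core.
    smt_groups = [
        group for group in root.iter("group")
        if group.find("flags/flag[@name='SMT']") is not None
    ]
    threads = sum(int(group.find("cpu").get("count")) for group in top_level)
    return {"packages": len(top_level), "cores": len(smt_groups), "threads": threads}


print(summarize(TOPOLOGY_SPEC))  # {'packages': 1, 'cores': 4, 'threads': 8}
```

With one package and four cores, cores_per_package becomes 4 instead of 0. On a host without SMT this sketch would report zero cores, so it should not be read as a general solution.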
It's not what I have after 5 patches:
struct cpuinfo_freebsd_topology cpuinfo_freebsd_detect_topology(void) {
int packages = cpuinfo_from_freebsd_sysctl("kern.smp.cpus");
int cores = cpuinfo_from_freebsd_sysctl("kern.smp.cores");
int threads_per_core = cpuinfo_from_freebsd_sysctl("kern.smp.threads_per_core");
cpuinfo_log_debug("freebsd topology: packages = %d, cores = %d, threads_per_core = %d", packages, cores, threads_per_core);
struct cpuinfo_freebsd_topology topology = {
.packages = (uint32_t) packages,
.cores = (uint32_t) cores,
.threads_per_core = (uint32_t) threads_per_core,
.threads = (uint32_t) (threads_per_core * cores)
};
return topology;
}
Maybe you need to squash the commits.
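For comparison, here is a minimal sketch of how the derived per-package counts could be guarded so that an implausible topology produces a clear error instead of a divide-by-zero. It is written in Python rather than C. The `Topology` class and `per_package_counts` function are hypothetical, and the formula for threads per package is an assumption, since the thread does not show how init.c computes it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Topology:
    """Hypothetical mirror of the fields in struct cpuinfo_freebsd_topology."""
    packages: int
    cores: int
    threads_per_core: int


def per_package_counts(topology: Topology) -> Tuple[int, int]:
    """Return (cores_per_package, threads_per_package), rejecting bad input."""
    if topology.packages <= 0 or topology.cores < topology.packages:
        # cores // packages would be 0 here, and later divisions would trap.
        raise ValueError(f"implausible topology: {topology}")
    cores_per_package = topology.cores // topology.packages
    # Assumption: every core in a package has the same number of threads.
    return cores_per_package, cores_per_package * topology.threads_per_core


# The values from the older patch set: kern.smp.cpus = 8 used as "packages".
try:
    per_package_counts(Topology(packages=8, cores=4, threads_per_core=2))
except ValueError as error:
    print(error)

# The values expected on this host once packages is detected as 1.
print(per_package_counts(Topology(packages=1, cores=4, threads_per_core=2)))  # (4, 8)
```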
@yurivict The history is somewhat messy. Can you remove the local repository and re-clone? I would squash until it is stable
Sorry, I had not applied all the patches before. The current set of patches works fine on my machine.
@yurivict Glad to see that. What is your CPU utilization? Can all cores reach 100%?
This example prints that torch.get_num_threads() is 4 for some reason.
@yurivict There is a num_threads /= 2; in PyTorch at c10/core/thread_pool.h line 42, so you get half of the cores. There is no comment explaining the divide-by-2 logic, and cpuinfo is not used there.
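The halving described above accounts for the reported number on this host. The sketch below only models the arithmetic in Python. `halved_default` is a hypothetical name, and the input of 8 is the logical CPU count taken from the topology output earlier in the thread.

```python
def halved_default(hardware_threads: int) -> int:
    """Hypothetical model of `num_threads /= 2` applied to the logical CPU count."""
    # C++ integer division truncates, which `//` reproduces for non-negative values.
    return hardware_threads // 2


print(halved_default(8))  # 4
```

If all eight logical CPUs should be used, calling torch.set_num_threads(8) at the start of the script is one way to override the default.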
Hi @cyyever
I updated the same PR so that it merges with the latest cpuinfo revision: https://github.com/pytorch/cpuinfo/pull/230
I verified that it works with the OpenAI Whisper project and with many simple test cases.
Could you please merge it?
Thank you, Yuri
@Maratyszcza Most of the listed issues have been fixed, except the cores value, which is returned by FreeBSD. Can you help review again?
I no longer maintain this project, maybe @malfet or @fbarchard could review?
@fbarchard Help merge it?
@malfet Help merge it?
The README.md file should mention FreeBSD support.
The support is still experimental.
I still think you should publish it. Just mention that it is experimental.
The testcase below fails with a "Floating point exception" with this patch on FreeBSD 13.2.