nv-legate / cupynumeric

An Aspiring Drop-In Replacement for NumPy at Scale
https://docs.nvidia.com/cupynumeric
Apache License 2.0

Error "Current configuration results in zero workers" while running test suite #801

Closed wlai0611 closed 1 year ago

wlai0611 commented 1 year ago

Hello, I hope you are well. When I attempt to run the following in the root of the repo:

legate test.py

I receive an error regarding my current configuration:

[0 - 7f7efdb2e000]    0.000046 {4}{threads}: reservation ('Python-1 proc 1d00000000000006') cannot be satisfied
[0 - 7f7df9c89000]    0.511264 {6}{python}: python exception occurred within task:
Traceback (most recent call last):
  File "/home/walter/anaconda3/envs/simulations/lib/python3.1/site-packages/legion_top.py", line 477, in legion_python_main
    run_path(args[start], run_name='__main__')
  File "/home/walter/anaconda3/envs/simulations/lib/python3.1/site-packages/legion_top.py", line 298, in run_path
    exec(code, module.__dict__, module.__dict__)
  File "test.py", line 47, in <module>
    plan = TestPlan(config, system)
  File "/home/walter/anaconda3/envs/simulations/lib/python3.1/site-packages/legate/tester/test_plan.py", line 47, in __init__
    self._stages = [
  File "/home/walter/anaconda3/envs/simulations/lib/python3.1/site-packages/legate/tester/test_plan.py", line 48, in <listcomp>
    STAGES[feature](config, system) for feature in config.features
  File "/home/walter/anaconda3/envs/simulations/lib/python3.1/site-packages/legate/tester/stages/_linux/cpu.py", line 54, in __init__
    self._init(config, system)
  File "/home/walter/anaconda3/envs/simulations/lib/python3.1/site-packages/legate/tester/stages/test_stage.py", line 280, in _init
    self.spec = self.compute_spec(config, system)
  File "/home/walter/anaconda3/envs/simulations/lib/python3.1/site-packages/legate/tester/stages/_linux/cpu.py", line 75, in compute_spec
    workers = adjust_workers(len(cpus) // procs, config.requested_workers)
  File "/home/walter/anaconda3/envs/simulations/lib/python3.1/site-packages/legate/tester/stages/util.py", line 97, in adjust_workers
    raise RuntimeError("Current configuration results in zero workers")
RuntimeError: Current configuration results in zero workers
legion_python: /opt/conda/conda-bld/legate-core_1675127265170/work/build/_deps/legion-src/runtime/realm/python/python_module.cc:996: virtual void Realm::LocalPythonProcessor::execute_task(Realm::Processor::TaskFuncID, const Realm::ByteArrayRef&): Assertion `0' failed.
Signal 6 received by node 0, process 895982 (thread 7f7df9c89000) - obtaining backtrace
Signal 6 received by process 895982 (thread 7f7df9c89000) at: stack trace: 15 frames
  [0] = /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f7efdc89520]
  [1] = /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c) [0x7f7efdcdda7c]
  [2] = /lib/x86_64-linux-gnu/libc.so.6(raise+0x16) [0x7f7efdc89476]
  [3] = /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3) [0x7f7efdc6f7f3]
  [4] = /lib/x86_64-linux-gnu/libc.so.6(+0x2871b) [0x7f7efdc6f71b]
  [5] = /lib/x86_64-linux-gnu/libc.so.6(+0x39e96) [0x7f7efdc80e96]
  [6] = /home/walter/anaconda3/envs/simulations/bin/../lib/librealm.so.1(+0x52805c) [0x7f7efe57805c]
  [7] = /home/walter/anaconda3/envs/simulations/bin/../lib/librealm.so.1(+0x4cc6d1) [0x7f7efe51c6d1]
  [8] = /home/walter/anaconda3/envs/simulations/bin/../lib/librealm.so.1(+0x4cc746) [0x7f7efe51c746]
  [9] = /home/walter/anaconda3/envs/simulations/bin/../lib/librealm.so.1(+0x52b449) [0x7f7efe57b449]
  [10] = /home/walter/anaconda3/envs/simulations/bin/../lib/librealm.so.1(+0x4cf34a) [0x7f7efe51f34a]
  [11] = /home/walter/anaconda3/envs/simulations/bin/../lib/librealm.so.1(+0x52a494) [0x7f7efe57a494]
  [12] = /home/walter/anaconda3/envs/simulations/bin/../lib/librealm.so.1(+0x4d4a6f) [0x7f7efe524a6f]
  [13] = /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7f7efdcdbb43]
  [14] = /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7f7efdd6da00]

Would anyone know how I should call

legate test.py

with proper options or configurations? Thank you!

manopapad commented 1 year ago

You'll want to call ./test.py directly, not through legate. The script itself will call legate internally. I updated the README to note this.

wlai0611 commented 1 year ago

Thanks for the reply! I tried to call ./test.py and I still receive an abbreviated form of the prior error.

(simulations) walter@clausius:~/Desktop/cunumeric$ ./test.py
Traceback (most recent call last):
  File "/home/walter/Desktop/cunumeric/./test.py", line 47, in <module>
    plan = TestPlan(config, system)
  File "/home/walter/anaconda3/envs/simulations/lib/python3.10/site-packages/legate/tester/test_plan.py", line 47, in __init__
    self._stages = [
  File "/home/walter/anaconda3/envs/simulations/lib/python3.10/site-packages/legate/tester/test_plan.py", line 48, in <listcomp>
    STAGES[feature](config, system) for feature in config.features
  File "/home/walter/anaconda3/envs/simulations/lib/python3.10/site-packages/legate/tester/stages/_linux/cpu.py", line 54, in __init__
    self._init(config, system)
  File "/home/walter/anaconda3/envs/simulations/lib/python3.10/site-packages/legate/tester/stages/test_stage.py", line 280, in _init
    self.spec = self.compute_spec(config, system)
  File "/home/walter/anaconda3/envs/simulations/lib/python3.10/site-packages/legate/tester/stages/_linux/cpu.py", line 75, in compute_spec
    workers = adjust_workers(len(cpus) // procs, config.requested_workers)
  File "/home/walter/anaconda3/envs/simulations/lib/python3.10/site-packages/legate/tester/stages/util.py", line 97, in adjust_workers
    raise RuntimeError("Current configuration results in zero workers")
RuntimeError: Current configuration results in zero workers

manopapad commented 1 year ago

@bryevdv any idea what the issue is here?

bryevdv commented 1 year ago

Can you try running with

./test.py --cpu-pin=none  --debug

What are the details of your system? (platform, OS, #cpus, etc)

bryevdv commented 1 year ago

In any case the proximate issue is that this computation results in zero workers:

        procs = config.cpus + config.utility + int(config.cpu_pin == "strict")
        workers = adjust_workers(len(cpus) // procs, config.requested_workers)

so adding --utility 0 may also be needed.
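
To make the arithmetic concrete, here is a minimal sketch of that computation. It is not the actual legate tester code: `compute_workers` is a hypothetical helper, its zero check is a simplified stand-in for `adjust_workers`, and the values used (4 CPUs per test, 1 utility processor, `partial` pinning, a machine where the tester detects 4 CPUs) are assumptions pieced together from other comments in this thread.

```python
def compute_workers(detected_cpus: int, cpus: int, utility: int, cpu_pin: str) -> int:
    """Hypothetical stand-in for the worker computation quoted above."""
    # Processors reserved per worker: test CPUs, utility processors,
    # plus one extra when pinning is strict.
    procs = cpus + utility + int(cpu_pin == "strict")
    workers = detected_cpus // procs
    if workers == 0:
        # Simplified version of the check that raises in adjust_workers.
        raise RuntimeError("Current configuration results in zero workers")
    return workers


# Assumed defaults on a machine reporting 4 CPUs: 4 // (4 + 1 + 0) == 0.
try:
    compute_workers(detected_cpus=4, cpus=4, utility=1, cpu_pin="partial")
except RuntimeError as exc:
    print(exc)

# With --utility 0 the divisor drops to 4, so one worker fits: 4 // 4 == 1.
print(compute_workers(detected_cpus=4, cpus=4, utility=0, cpu_pin="none"))
```

Under these assumptions, either lowering the divisor (fewer CPUs or utility processors per worker) or running on a machine with more detected CPUs brings the result back above zero.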

manopapad commented 1 year ago

You can't run without at least one utility processor.

bryevdv commented 1 year ago

I suppose then lowering CPUS, e.g. --cpus 1 (down from default 4) is another option

EDIT: --utility 0 only feeds into the CPU pinning / allocation that the tester computes, so it should be fine.

magnatelee commented 1 year ago

@bryevdv can we make the pinning policy more lenient by default? If the test driver can't pin CPUs even for a single worker, I feel it should turn off the pinning and still create a single worker.

wlai0611 commented 1 year ago

Thanks so much for the help! I ran

./test.py --cpu-pin=none  --debug --utility 0

and the tests were able to run. I have an Ubuntu 22.04 machine with 8 CPUs and an x86_64 architecture. The output of the tests is below. Should I be concerned about the failed tests and how they may impact the usage of cunumeric?

################################################################################
###
### Test Suite Configuration
###
### * Feature stages       : cpus
### * Test files per stage : 135
### * TestSystem description   : 4 cpus / N/A gpus
###
################################################################################

################################################################################
### Entering stage: CPU (with 1 worker)
################################################################################

+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/benchmark.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/benchmark.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/black_scholes.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/black_scholes.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/cg.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/cg.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/cholesky.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/cholesky.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/einsum.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/einsum.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/gemm.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/gemm.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/indexing_routines.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/indexing_routines.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/jacobi.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/jacobi.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/kmeans.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/kmeans.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/kmeans_slow.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/kmeans_slow.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/linreg.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/linreg.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/logreg.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/logreg.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/lstm_backward.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/lstm_backward.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/lstm_forward.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/lstm_forward.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/richardson_lucy.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/richardson_lucy.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/scan.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/scan.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/solve.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/solve.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/sort.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/sort.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/examples/stencil.py -cunumeric:test --cpus 4
[PASS] (CPU) examples/stencil.py
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_0d_store.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_0d_store.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_advanced_indexing.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_advanced_indexing.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_append.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_append.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_arg_reduce.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_arg_reduce.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_argsort.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_argsort.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_array_creation.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_array_creation.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_array_dunders.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_array_dunders.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_array_fallback.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_array_fallback.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_array_split.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_array_split.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_astype.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_astype.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_atleast_nd.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_atleast_nd.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_binary_op_broadcast.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_binary_op_broadcast.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_binary_op_complex.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_binary_op_complex.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_binary_op_typing.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_binary_op_typing.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_binary_ufunc.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_binary_ufunc.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_bincount.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_bincount.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_bits.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_bits.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_block.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_block.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_cholesky.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_cholesky.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_complex_ops.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_complex_ops.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_compress.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_compress.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_concatenate_stack.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_concatenate_stack.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_contains.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_contains.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_convolve.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_convolve.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_copy.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_copy.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_data_interface.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_data_interface.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_diag_indices.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_diag_indices.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_dot.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_dot.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_einsum.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_einsum.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_einsum_path.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_einsum_path.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_exp.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_exp.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_extract.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_extract.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_eye.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_eye.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_fallback.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_fallback.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_fft_c2c.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_fft_c2c.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_fft_c2r.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_fft_c2r.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_fft_hermitian.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_fft_hermitian.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_fft_r2c.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_fft_r2c.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_fill.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_fill.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_fill_diagonal.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_fill_diagonal.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_flatten.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_flatten.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_flip.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_flip.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_floating.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_floating.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_get_item.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_get_item.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_index_routines.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_index_routines.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_indices.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_indices.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_ingest.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_ingest.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_inlinemap-keeps-region-alive.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_inlinemap-keeps-region-alive.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_inner.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_inner.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_intra_array_copy.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_intra_array_copy.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_jacobi.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_jacobi.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_length.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_length.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_linspace.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_linspace.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_logic.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_logic.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_logical.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_logical.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_lstm_backward_test.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_lstm_backward_test.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_lstm_simple_forward.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_lstm_simple_forward.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_map_reduce.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_map_reduce.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_mask.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_mask.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_mask_indices.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_mask_indices.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_matmul.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_matmul.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_matrix_power.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_matrix_power.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_min_on_gpu.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_min_on_gpu.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_moveaxis.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_moveaxis.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_msort.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_msort.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_multi_dot.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_multi_dot.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_ndim.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_ndim.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_nonzero.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_nonzero.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_norm.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_norm.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_ones.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_ones.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_outer.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_outer.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_overwrite_slice.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_overwrite_slice.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_partition.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_partition.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_prod.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_prod.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_put.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_put.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_put_along_axis.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_put_along_axis.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_putmask.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_putmask.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_randint.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_randint.py (exit: 1)
+LEGATE_TEST=1 REALM_SYNTHETIC_CORE_MAP= legate /home/walter/Desktop/cunumeric/tests/integration/test_random_advanced.py -cunumeric:test --cpus 4
[FAIL] (CPU) tests/integration/test_random_advanced.py (exit: 1)

bryevdv commented 1 year ago

@magnatelee I would suggest changing the defaults:

Is that what you had in mind? I think that would allow anyone with 4 cores to have tests run out of the box, and would be preferable to adding internal special case logic that is hard to describe and document. ("These parameters have these effects...except when they don't")

magnatelee commented 1 year ago

That's fair, but I was thinking of a "best-effort" mode whose name would clearly match the behavior. One downside with the suggested change is that we have to constantly pass --cpu-pin=strict, as that will be the common case. Plus, we need to change the CI scripts to make sure it gets passed. What prevents us from adding that best-effort logic to the driver?

bryevdv commented 1 year ago

@magnatelee what about just switching to --cpus=2 then? That should still work out of the box with 4 cores, even with strict pinning. [1] I would at least suggest starting with that and seeing if it is sufficient.

What prevents us from adding that best-effort logic to the driver?

Nothing at all, except after many years in OSS I am incredibly allergic to implementing changes that would mean writing docs like "--cpus N means reserve N cpu processors, except when some other flag is set, in which case it means try to use some unspecified number of cpu processors".

One downside with the suggested change is that we have to constantly pass --cpu-pin=strict, as that will be the common case. Plus, we need to change the CI scripts to make sure it gets passed. What prevents us from adding that best-effort logic to the driver?

Also, unless we are planning to make "best effort" the default, some users would still have to actively make some configuration change in order to run (i.e. adding --best-effort, or --cpu-pin=flexible, or whatever). But it's already possible to make successful configuration changes (see above), so why is a new, different method that muddies the clear meaning of existing parameters needed? And if we did make it the default, for the benefit of new users, then we would still be stuck making a bunch of CI changes to make sure CI doesn't use that mode.

[1] config.cpus + config.utility + int(config.cpu_pin == "strict") == 2 + 1 + 1 == 4
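
As a quick check of the footnote, the sketch below evaluates the same formula for a few --cpus values on a 4-core machine. `workers_for` is a hypothetical helper written for this illustration, not part of legate, and the defaults of one utility processor and strict pinning are assumptions taken from the footnote itself.

```python
def workers_for(cores: int, cpus: int, utility: int = 1, strict: bool = True) -> int:
    """Workers that fit on `cores` cores, using the formula from the footnote."""
    procs = cpus + utility + int(strict)
    return cores // procs


# Sweep --cpus on a 4-core machine with strict pinning and one utility processor.
for cpus in (1, 2, 3, 4):
    print(f"--cpus {cpus}: {workers_for(4, cpus)} worker(s)")
```

With these assumptions, --cpus 1 and --cpus 2 each leave room for one worker, while --cpus 3 and --cpus 4 yield zero.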

manopapad commented 1 year ago

@wlai0611 Do you get any further output, possibly at the end of the test log, that gives more details on what is failing? If not, you may want to try running test.py with --verbose. The fact that all tests are failing suggests a configuration issue that would most likely also show up if you tried to run custom scripts.

magnatelee commented 1 year ago

@magnatelee what about just switching to --cpus=2 then? That should still work out of the box with 4 cores, even with strict pinning.

That sounds like a reasonable default to me.

Nothing at all, except after many years in OSS I am incredibly allergic to implementing changes that would mean writing docs like "--cpus N means reserve N cpu processors, except when some other flag is set, in which case it means try to use some unspecified number of cpu processors".

I see your point, but I guess I'd treat the functionality flag (--cpus) and the performance flag (--cpu-pin) separately. Here's why:

When you give --cpus N, N means the number of logical CPUs the runtime should create to run the program. You will always get N logical CPUs regardless of whether they are pinned to physical CPUs or not. If for some reason the runtime couldn't create N logical CPUs, the runtime would raise an error. In that sense, I don't think there's anything ambiguous with --cpus and the flag is completely independent of pinning.

And if you think about the purpose of pinning, we do it only to improve the performance of the testing (i.e., how quickly we can finish running tests), and it has nothing to do with the core functionality (running tests and reporting failures). We happened to go the extra mile to get the tests done as quickly as possible, but the test driver would and should accomplish its job regardless of pinning. I'm not a big fan of having a performance knob interfere with the core functionality, and I believe that the test driver should just run tests for a given processor count, whether it pins CPUs for workers or not, without requiring the user to do anything (unless the runtime could not cater to that request). Along that reasoning, I don't expect users would or should want to know about the pinning unless they were testing some really weird corner cases. If we do want to report errors in circumstances where the pinning leads to zero workers, we should at least help the user figure out the right configuration, though even then the driver's logic to determine the amount of resources for each worker might not be completely obvious to users. So I'd rather make this pinning strategy best effort by default so users don't even need to think about it.

That being said, I guess it'd be pretty rare that the developer would hit this zero worker case, so a reasonable default with strict pinning doesn't sound like a bad idea.
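
The best-effort behavior argued for above could look roughly like the following. This is only a sketch of the idea, not the tester's implementation: `plan_workers` and its `(workers, pinned)` return value are hypothetical, and keeping the error for an explicit strict request is an assumption made for this illustration.

```python
from typing import Tuple


def plan_workers(detected_cpus: int, cpus: int, utility: int, cpu_pin: str) -> Tuple[int, bool]:
    """Return (workers, pinned) under a hypothetical best-effort policy."""
    procs = cpus + utility + int(cpu_pin == "strict")
    workers = detected_cpus // procs
    if workers > 0:
        # Enough cores for at least one worker: pin unless pinning is disabled.
        return workers, cpu_pin != "none"
    if cpu_pin == "strict":
        # Assumption: an explicit strict request still fails loudly.
        raise RuntimeError("Current configuration results in zero workers")
    # Best effort: give up on pinning and still run a single worker.
    return 1, False


# 4 detected CPUs cannot hold 4 CPUs + 1 utility, so fall back to one unpinned worker.
print(plan_workers(detected_cpus=4, cpus=4, utility=1, cpu_pin="partial"))

# 8 detected CPUs with --cpus 2 and strict pinning: 8 // (2 + 1 + 1) == 2 pinned workers.
print(plan_workers(detected_cpus=8, cpus=2, utility=1, cpu_pin="strict"))
```

The point of the sketch is that --cpus keeps its meaning in every branch: each worker still gets the requested number of logical CPUs, and only the pinning decision changes.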

bryevdv commented 1 year ago

@magnatelee I'll start with a PR to change the default and make an issue about the pinning. Currently we have --cpu-pin=partial to mean "pin on platforms where pinning is possible", but maybe we can expand that to mean "pin to the extent possible (regardless of platform)". Broadening that existing meaning would be a smaller change than adding another "mode" entirely.

magnatelee commented 1 year ago

@bryevdv Sounds great. I didn't realize there's already a partial pinning mode, which is the default and also close to the behavior I was expecting. (@manopapad also pointed that out last night in an offline chat.)