nv-legate / legate.core

The Foundation for All Legate Libraries
https://docs.nvidia.com/legate/24.06/
Apache License 2.0

[BUG] Dimension mismatch: invalid to create a 1-D accessor to a 2-D store #947

Closed syamajala closed 3 months ago

syamajala commented 5 months ago

Software versions

cunumeric: 9ece0a31b9 legate.core: 0f509a007f36

Jupyter notebook / Jupyter Lab version

No response

Expected behavior

The code should run without errors.

Observed behavior

The following line: https://github.com/syamajala/levenberg-marquardt-method/blob/main/levenberg_marquardt.py#L393

cvg_hst[iteration-1,i+2] = p.T[0][i]

results in this error:

/sdf/home/s/seshu/levenberg-marquardt-method/levenberg_marquardt.py:341: RuntimeWarning: cuNumeric has not implemented inv and is falling back to canonical NumPy. You may notice significantly decreased performance for this function call.
  rho = np.matmul( h.T @ (lambda_ * h + JtWdy),np.linalg.inv(X2 - X2_try))
0
terminate called after throwing an instance of 'std::invalid_argument'
  what():  Dimension mismatch: invalid to create a 1-D accessor to a 2-D store

I also tried:

cvg_hst[iteration-1,i+2] = p.T[0,i]

but still see the same error.
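
For readers who do not want to clone the repository, the pattern on that line can be reduced to a few lines. This is only a sketch: the sizes `n_par` and `max_iter` and the shapes of `p` and `cvg_hst` are assumptions for illustration and are not taken from levenberg_marquardt.py, and it has not been confirmed that this reduced form triggers the same error. It falls back to plain NumPy when cuNumeric is not installed, in which case it simply runs to completion.

```python
# Minimal sketch of the pattern from the report; shapes are assumptions.
try:
    import cunumeric as np  # exercises the Legate code path when installed
except ImportError:
    import numpy as np  # fallback so the sketch still runs without cuNumeric

n_par, max_iter = 4, 3  # hypothetical sizes, not from the original script

# Assumed layout: p is a column vector, cvg_hst is a 2-D history array.
p = np.arange(n_par, dtype=np.float64).reshape(n_par, 1)
cvg_hst = np.ones((max_iter, n_par + 2))

iteration = 1
for i in range(n_par):
    # Scalar read from a transposed 2-D array, scalar write into a 2-D array.
    cvg_hst[iteration - 1, i + 2] = p.T[0, i]

# Convert to plain Python floats for easy inspection.
row = [float(x) for x in cvg_hst[iteration - 1, 2:]]
print(row)
```

If the loop completes, `row` holds the values of `p` copied into the first history row.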

Example code or instructions

Clone this repository: https://github.com/syamajala/levenberg-marquardt-method/tree/main

Run example_LM.py

Stack traceback or browser console output

No response

syamajala commented 3 months ago

@manopapad a version of this bug is still there in legate.core 24.06.00:

terminate called after throwing an instance of 'std::invalid_argument'                                                
  what():  Dimension mismatch: invalid to retrieve a 1-D rect from a 2-D store 

Signal 6 received by process 1920639 (thread 7fdba018c000) at: stack trace: 17 frames                                 
  [0] = raise at unknown file:0 [00007fdbaa6a1a9f]
  [1] = abort at unknown file:0 [00007fdbaa674e04]
  [2] = __gnu_cxx::__verbose_terminate_handler() at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95 [00007fdbadafbf9d]
  [3] = __cxxabiv1::__terminate(void (*)()) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48 [00007fdbadafa4e1]
  [4] = std::terminate() at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:58 [00007fdbadaf42e2]                  
  [5] = __cxa_throw at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:98 [00007fdbadafa701]                           
  [6] = legate::detail::PhysicalStore::check_shape_dimension_(int) const [clone .cold] at unknown file:0 [00007fda66d57e3f]
  [7] = Realm::Rect<1, long long> legate::PhysicalStore::shape<1>() const at unknown file:0 [00007fda43730a72]        
  [8] = Legion::FieldAccessor<(legion_privilege_mode_t)1, double, 1, long long, Realm::AffineAccessor<double, 1, long long>, false> legate::PhysicalStore::read_accessor<double, 1, false>() const at unknown file:0 [00007fda43fda590]      
  [9] = cunumeric::WriteTask::cpu_variant(legate::TaskContext) at unknown file:0 [00007fda44085534]                   
  [10] = legate::detail::task_wrapper(void (*)(legate::TaskContext), legate_core_variant_t, std::optional<std::basic_string_view<char, std::char_traits<char> > >, void const*, unsigned long, void const*, unsigned long, Realm::Processor) at unknown file:0 [00007fda66ed3f82]
  [11] = void legate::LegateTask<cunumeric::WriteTask>::task_wrapper_<&cunumeric::WriteTask::cpu_variant, (legate_core_variant_t)1>(void const*, unsigned long, void const*, unsigned long, Realm::Processor) at unknown file:0 [00007fda4408423a]
  [12] = Realm::Task::execute_on_processor(Realm::Processor) at unknown file:0 [00007fdbaaf385d0]                     
  [13] = Realm::UserThreadTaskScheduler::execute_task(Realm::Task*) at unknown file:0 [00007fdbaaf38665]              
  [14] = Realm::ThreadedTaskScheduler::scheduler_loop() at unknown file:0 [00007fdbaaf36c19]                          
  [15] = Realm::UserThread::uthread_entry() at unknown file:0 [00007fdbaaf3d716]                                      
  [16] = unknown symbol at unknown file:0 [00007fdbaa6770af]                                                          

This code also calls np.linalg.inv and np.correlate, which seem to be missing in cunumeric.
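
As a possible way to avoid the explicit inverse, a product of the form `B @ inv(A)` can be rewritten as a linear solve. The sketch below uses plain NumPy and made-up matrices `A` and `B` to check the identity; whether the installed cuNumeric version provides `linalg.solve` is an assumption that would need to be verified, and this does not address `np.correlate`.

```python
import numpy as np

# Hypothetical inputs: A is strictly diagonally dominant, hence invertible.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
B = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Original style: multiply by an explicit inverse.
via_inv = B @ np.linalg.inv(A)

# Equivalent form: X = B @ inv(A)  <=>  A.T @ X.T = B.T
via_solve = np.linalg.solve(A.T, B.T).T

print(np.allclose(via_inv, via_solve))
```

Solving the system is also generally preferred numerically over forming the inverse.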

manopapad commented 3 months ago

I am testing out a solution to the assertion failure.

manopapad commented 3 months ago

The bug is fixed. The fix should be included in an upcoming build.

manopapad commented 1 month ago

The bugfix was included in patch release 24.06.01.