Open d3v-null opened 2 years ago
This looks like it will be fun! It also looks like the crash is in some test code outside of Rubbl, so let me know if/how I can help ...
all good, I appreciate the moral support :P
so effectively, threads 11 and 12 are both instantiating the ArrayColumn object for column OFFSET of table /home/dev/Marlu/tests/data/1254670392_avg/1254670392.cotter.none.trunc.ms/ANTENNA at the same address, baseTabPtr_p = 0x7fffc804d300, like this:
casacore::ArrayColumn<CPPTYPE> col(table, bridge_string(col_name))
and they crash here:
the address is probably the same because both threads are getting the table from TableCache.
So what should we do? I guess we have a few options:
Column instantiation mutex to prevent two threads accessing the same column from the same table address
Sigh. I guess we're going to need some kind of global mutex; I think it's unacceptable to have known usage patterns that can cause crashes.
Coming from there, I suspect that your third suggestion is the best way to go — if the table cache is causing problems, and I suspect that it is, I think that we'll want to enforce single-threaded access in a way that tracks the cache's indexing. I'd guess that a mutex restricted to Columns in particular might solve the current problem but that there would be other similar problems cropping up with other use cases.
Whatever the solution is, we should make sure to describe it well in the API docs in the appropriate places.
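For illustration, here is a rough sketch of what the global mutex idea could look like on the C++ side. It does not use casacore at all: FakeColumn is a hypothetical stand-in for casacore::ArrayColumn whose constructor is simply assumed to be unsafe to run concurrently, and g_column_mutex, open_column_locked and constructors_overlapped are made-up names, not anything in glue.cc. It only serializes construction, so other kinds of access would presumably need the same treatment.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Bookkeeping used only to detect two constructors running at the same time.
std::atomic<int> g_in_flight{0};
std::atomic<bool> g_overlap{false};

// Hypothetical stand-in for casacore::ArrayColumn<T>. The constructor is
// assumed to be unsafe to run concurrently; it records any overlap it sees.
struct FakeColumn {
    std::string name;
    explicit FakeColumn(const std::string& col_name) : name(col_name) {
        if (g_in_flight.fetch_add(1) != 0) {
            g_overlap = true;  // another constructor was already running
        }
        std::this_thread::yield();  // widen the window for a potential overlap
        g_in_flight.fetch_sub(1);
    }
};

// One process-wide mutex guarding every column instantiation (assumed design).
std::mutex g_column_mutex;

// Construct a column while holding the global mutex. The lock is released
// only after the returned object has been fully constructed.
FakeColumn open_column_locked(const std::string& col_name) {
    std::lock_guard<std::mutex> guard(g_column_mutex);
    return FakeColumn(col_name);
}

// Hammer the guarded constructor from several threads and report whether any
// two constructions overlapped.
bool constructors_overlapped(int n_threads) {
    assert(n_threads > 0);
    g_overlap = false;
    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i) {
        workers.emplace_back([] {
            for (int j = 0; j < 1000; ++j) {
                open_column_locked("OFFSET");
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return g_overlap.load();
}
```

With the lock in place, constructors_overlapped(4) should come back false. A single global mutex is the bluntest option; a mutex per cached table would allow more parallelism, but it would have to follow the cache's indexing as discussed above.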
sigh here we go again. 🙄
it's the end of the day for me, but I'll keep investigating this tomorrow.
here's the trace
areas of interest:
glue.cc:1457
casacore/tables/Tables/ColumnDesc.cc:82
info args