Open howard-3 opened 1 year ago
Tablesaw makes no claim of thread safety
On Tue, Aug 29, 2023 at 12:02 PM howard-3 @.***> wrote:
Example code to reproduce
import tech.tablesaw.api.ColumnType; // only needed if the workaround line below is uncommented
import tech.tablesaw.api.DoubleColumn;
import tech.tablesaw.api.StringColumn;

public class Test {
    public static void main(String[] args) throws Exception {
        // Uncomment the next line to prevent the initialization deadlock.
        // ColumnType.values();
        Runnable r1 = () -> { StringColumn.create("abc"); };
        Runnable r2 = () -> { DoubleColumn.create("def"); };
        Thread t1 = new Thread(r1);
        Thread t2 = new Thread(r2);
        System.out.println("Starting");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("Done");
    }
}
Constructor flow
graph TD;
    StringColumn --> |in constructor refers to| StringColumnType
    StringColumnType --> |is an extension of| AbstractColumnType
    AbstractColumnType --> |is an impl of| ColumnType
    ColumnType --> |initializes the final values of| StringColumnType
The root cause seems to be that the ColumnType class references instances of many *ColumnType classes when it initializes its static final values.
This becomes a problem when StringColumn and another column type (DoubleColumn in the example) are constructed for the first time concurrently. One thread can hold the class initialization lock for StringColumn (and StringColumnType), while the other thread holds the one for DoubleColumn and DoubleColumnType. In that scenario, StringColumn cannot finish initialization because it depends on the init lock for ColumnType, which in turn relies on DoubleColumnType, whose lock is held by the other thread, so neither thread can make progress.
My temporary fix is simply to ensure the class for ColumnType is loaded first.
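For anyone who wants the same workaround without calling a Tablesaw method, here is a minimal sketch that forces a class's static initialization by name through the standard `Class.forName` API. `EagerInit` and `forceInit` are hypothetical names, and the Tablesaw class name in the comment is my assumption about where `ColumnType` lives; the call has to happen on a single thread before any worker threads touch the column classes.

```java
// Sketch: force a class's static initialization on the calling thread.
// EagerInit and forceInit are hypothetical helper names, not part of Tablesaw.
final class EagerInit {
    private EagerInit() {}

    /** Loads and initializes the named class, then returns it. */
    static Class<?> forceInit(String className) {
        try {
            // initialize = true runs the static initializers if they have not run yet.
            return Class.forName(className, true, EagerInit.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class not on classpath: " + className, e);
        }
    }

    public static void main(String[] args) {
        // With Tablesaw on the classpath the call would presumably be:
        // forceInit("tech.tablesaw.api.ColumnType");   // assumed fully qualified name
        // A JDK class is used here so the sketch runs without any dependency.
        System.out.println(forceInit("java.lang.String").getName());
    }
}
```

This only sidesteps the problem for callers who remember to do it; it does not remove the initialization cycle itself.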
I'm happy to contribute a fix, but I'm not sure what the best approach is. Maybe move all the initialization for the different *ColumnTypes to a new class?
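To make the "new class" idea concrete, here is a rough sketch using stand-in types rather than the real Tablesaw classes. `Kind`, `TextKind`, `NumberKind`, `Kinds` and `RegistryDemo` are all hypothetical names. The point is only that the interface no longer refers to its implementations during its own initialization, so the dependency graph between class initializers has no cycle; how this would map onto the actual `ColumnType` API is an open question.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Stand-in for ColumnType: no static fields that refer to implementations.
interface Kind {
    String name();

    // A default method means initializing an implementation also initializes this interface.
    default String describe() {
        return "kind:" + name();
    }
}

// Stand-ins for StringColumnType / DoubleColumnType.
final class TextKind implements Kind {
    static final TextKind INSTANCE = new TextKind();
    private TextKind() {}
    @Override public String name() { return "TEXT"; }
}

final class NumberKind implements Kind {
    static final NumberKind INSTANCE = new NumberKind();
    private NumberKind() {}
    @Override public String name() { return "NUMBER"; }
}

// The "new class": a registry that depends on the implementations, never the reverse.
final class Kinds {
    static final Kind TEXT = TextKind.INSTANCE;
    static final Kind NUMBER = NumberKind.INSTANCE;
    private Kinds() {}
    static Kind[] values() { return new Kind[] {TEXT, NUMBER}; }
}

final class RegistryDemo {
    /** Touches both implementations for the first time from two threads; true if both finish in time. */
    static boolean firstUseFromTwoThreads() {
        CountDownLatch done = new CountDownLatch(2);
        Thread t1 = new Thread(() -> { TextKind.INSTANCE.describe(); done.countDown(); });
        Thread t2 = new Thread(() -> { NumberKind.INSTANCE.describe(); done.countDown(); });
        t1.setDaemon(true); // daemon threads so a hang cannot keep the JVM alive
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            return done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("finished: " + firstUseFromTwoThreads());
        System.out.println("types: " + Kinds.values().length);
    }
}
```

The trade-off is that existing callers of the constants on the interface would need to move to the registry class, so this would be a breaking change unless the old fields can be kept in some lazy form.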
View it on GitHub: https://github.com/jtablesaw/tablesaw/issues/1230
Relevant JVM tickets: https://bugs.openjdk.org/browse/JDK-8037567