GraphBLAS / graphblas-api-c

Other
7 stars 3 forks source link

thread-safety of GrB_error is a non-issue. #61

Closed DrTimothyAldenDavis closed 2 years ago

DrTimothyAldenDavis commented 2 years ago

Line 2795 states:

This is a
thread-safe function, in the sense that multiple threads can call it simultaneously and each will get
its own error string back, referring to the object passed as obj. [Scott: Is this really true?]

This is not true exactly, but it is still thread safe unless one thread decides to free or modify the underlying object. The error string is owned by the object. If two user threads want to read it and not modify the obj, this would be thread safe although I suppose I would need to do an internal GrB_wait (obj, GrB_COMPLETE) to make sure this is true (that is just a #pragma omp flush, which I can do when writing the error string to the object).

What is false about this statement: each thread will not get its own error string. They will all read the one error string that is owned by the object. There is only one error string. I don't see how GrB_error from 2 threads can safely create 2 error strings holding the same content; that would be hard to do and useless as well.

Is it important that this function be thread safe, itself? The only time an object can have error string attached to an output (like C of GrB_mxm (C, ...)). An error string cannot safely be attached to an input matrix since other threads might be using that as a read-only object on which GrB_wait (A, ...) has been called. So only the owning user thread would want to do GrB_error (&err, C) on this matrix C anyway. If the user thread first does GrB_wait (C, ...), so C is safe for other threads to use, then this would clear the error string anyway, per the spec for GrB_error.

Since one thread can't access another thread's output matrix C without using GrB_wait first, and since GrB_wait clears the error string (line 2794),


The string that is returned is owned by obj and will be valid until the next time obj is
used as an OUT or INOUT parameter or the object is freed by a call to GrB_free(obj).

then I see no way that two user threads can hope to read the same error from the same matrix C.

mcmillan03 commented 2 years ago

Assigning to Tim M. I believe it is accurate to just remove the sentence in red.

tgmattso commented 2 years ago

I will go into the document and check the text in question.

... but I suspect there is some confusion on what the term "thread safety" means. It is not about concurrency with multiple threads operating on overlapping memory regions. Thread safety says threads can call an API function on non-overlapping memory regions (i.e. separate objects) and everything will work. If two threads look at the error string on the same object at the same time, strange things can happen.

Our error function are thread safe. That is correct. But there are all sorts of ways things can go wrong if multiple threads mix reads and writes to the same object. For example, if two threads read the error code attached to an object and then one thread tries to update or free that object, all hell can break loose. Thread safety does not enter in there.

The classic example of thread safety is with printf(). My output records can overlap when I have multiple threads writing to stdout. But as long as there are no data races in the user's program, each individual recored is correctly written to the screen.

tgmattso commented 2 years ago

The text in Red on lines 2792 to 2795 are incorrect. We do not promise that each thread will get its own string. That is too strong of a statement and improperly constrains what an implementation can do. the correct text is:

This is a thread-safe function. It can be safely called by multiple threads for the same object in a race-free program.