Looks like WriteFile needs a class-level pthread_rwlock_t to protect against race conditions:
From a plfs-devel mailing list exchange (http://sourceforge.net/p/plfs/mailman/message/32177946/):
Jingwang and Xuezhao, while working on the Fast Forward project, just found some nasty race conditions in WriteFile! Jingwang, please correct me if I get anything wrong here or omit anything. Here’s what Jingwang just wrote me about this:
"I think this is a very critical path in PLFS, because all of the write operations will go through the code of WriteFile. After reviewing the source code of WriteFile, I think that it has more problems when multiple threads are considered, such as unprotected read on a member which could be shared between threads. It needs more detailed examination to ensure thread-safety. Xuezhao's case brings up a whole class of serious problems under multiple threads.”
We decided that a class-level pthread_rwlock_t would be the safest approach, since the map is only modified when dropping files are added; most of the time, taking the read lock will not limit any parallelism.
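[A minimal sketch of what that could look like; the member name fhs_rwlock and its placement are illustrative, not the actual patch:

    pthread_rwlock_t fhs_rwlock;  // assumed new WriteFile member, initialized in the constructor

    struct OpenFh *WriteFile::getFh( pid_t pid ) {
        struct OpenFh *ofh = NULL;
        pthread_rwlock_rdlock( &fhs_rwlock );   // many lookups can hold this concurrently
        map<pid_t,OpenFh>::iterator itr = fhs.find( pid );
        if ( itr != fhs.end() ) {
            ofh = &(itr->second);
        }
        pthread_rwlock_unlock( &fhs_rwlock );
        return ofh;
    }

    // any path that adds a new dropping's fh (a struct OpenFh newfh) takes the write lock:
    pthread_rwlock_wrlock( &fhs_rwlock );
    fhs[pid] = newfh;                           // readers are excluded while the map mutates
    pthread_rwlock_unlock( &fhs_rwlock );

Returning a pointer into the map stays safe only because entries are not removed while the file is open; if removal were ever added, *ofh would need its own lifetime protection.]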
[We also had a more general discussion about how important thread parallelism is to PLFS. For large HPC and MPI jobs, parallelism is typically achieved through multiple processes, so a single-threaded PLFS wouldn’t hurt performance there. But systems are moving to more cores, and we have small, performance-critical jobs like archive transfers, so I think we should care about thread parallelism to some degree as well. Also, in a lot of the burst buffer future planning, PLFS is being moved off the compute nodes onto the burst buffer nodes, where it will aggregate requests from many ranks on many compute nodes. That will likely be done as threads within a single PLFS process, so we need to keep caring about thread parallelism within PLFS.]
Thanks,
John
On Apr 2, 2014, at 9:20 AM, Bonnie, Dave dbonnie@lanl.gov wrote:
I would lean towards locking it on any access - STL maps are not thread-safe. If you search it while it is being updated it could lead to a situation where you return an invalid OpenFh.
In particular, iterators from STL maps are what were causing many of the thread-safety errors I fixed in the FUSE daemon last year.
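[The failure mode Dave describes, sketched against the current members; the interleaving and the name newpid are illustrative:

    // thread A, in getFh(), takes no lock:
    itr = fhs.find( pid );              // walks the map's internal tree

    // meanwhile thread B adds a writer while holding data_mux:
    pthread_mutex_lock( &data_mux );
    fhs[newpid] = ofh;                  // the insert may rebalance that same tree
    pthread_mutex_unlock( &data_mux );

B's lock does not protect A, because A never takes it: A can observe the tree mid-rebalance and come back with a bogus iterator, i.e. the invalid OpenFh Dave mentions.]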
On Apr 2, 2014, at 9:10 AM, Chuck Cranor chuck@ece.cmu.edu wrote:
hi-
looking at the current code: in WriteFile class we have:

    map< pid_t, OpenFh > fhs;
    pthread_mutex_t data_mux; // to access our map of fds

in the WriteFile code when we modify fhs, we lock data_mux.
Q: is it safe to traverse fhs without locking it? consider:
    plfs_error_t Container_fd::write(const char *buf, size_t size, off_t offset,
                                     pid_t pid, ssize_t *bytes_written) {
        // ...
        plfs_error_t ret = PLFS_SUCCESS;
        ssize_t written;
        WriteFile *wf = pfd->getWritefile();
        ret = wf->write(buf, size, offset, pid, &written);
        // ...
    }
ok, now within WriteFile::write we do:
    plfs_error_t WriteFile::write(const char *buf, size_t size, off_t offset,
                                  pid_t pid, ssize_t *bytes_written) {
        plfs_error_t ret = PLFS_SUCCESS;
        ssize_t written = 0;
        struct OpenFh *ofh = getFh( pid );
        // ...
that "getFh(pid)" searches fhs without locking data_mux:
    struct OpenFh *WriteFile::getFh( pid_t pid ) {
        map<pid_t,OpenFh>::iterator itr;
        struct OpenFh *ofh = NULL;
        if ( (itr = fhs.find( pid )) != fhs.end() ) {
            // ...
is it safe to call fhs.find(pid) without holding data_mux?
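[The simplest guard, if the existing mutex were kept rather than the rwlock discussed above, would be to take data_mux inside getFh too; a sketch, not the committed fix:

    struct OpenFh *WriteFile::getFh( pid_t pid ) {
        struct OpenFh *ofh = NULL;
        pthread_mutex_lock( &data_mux );    // same mutex the modify paths hold
        map<pid_t,OpenFh>::iterator itr = fhs.find( pid );
        if ( itr != fhs.end() ) {
            ofh = &(itr->second);
        }
        pthread_mutex_unlock( &data_mux );
        return ofh;
    }

This serializes every lookup against every other one, which is why the thread settled on a rwlock instead.]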
I'm trying to map this stuff into the new Index framework and wasn't sure about this one.
chuck