Open octo47 opened 9 years ago
Could you please write a unit test to exemplify your point? We are trying to rely on internal HBase locks for consistent updates of the namespace on a RegionServer.
HBase does not help here, because it guarantees consistent updates only within a single row. We check and modify several rows (mkdirs creates all parents of a file, which is definitely more than one row), so any other process can call delete() on one of our parents without seeing our updates, and therefore will not delete them. The same is true for file and directory renames. There are actually many places where we need locking or CAS versioning: anywhere we do a read, a check, and then a write, we can fail. For example, the create() method is full of such unexpected behaviours:
    if(iFile != null) {
      if(iFile.isDir()) {
        throw new FileAlreadyExistsException(
            "File already exists as directory: " + src);
      } else if(overwrite) {
        if(!deleteFile(iFile, true)) {
          throw new IOException("Cannot override existing file: " + src);
        }
      } else if(iFile.getFileState().equals(FileState.UNDER_CONSTRUCTION)) {
        // Opening an existing file for write - may need to recover lease.
        reassignLease(iFile, src, clientName, false);
      } else {
        throw new FileAlreadyExistsException();
      }
    }
Without a lock, this code has many places where we could end up in an inconsistent state:
At the moment we have no locks for mutations on the tree. A directory can be deleted while we are creating a file or directories (mkdirs) under it. That puts the filesystem in an inconsistent state and can lead to phantom directories or files: they are not visible when listing, but a file cannot be created at that path because an entry is already there.
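To make the interleaving concrete, here is a minimal in-memory sketch of the race described above. It does not use the HBase API: the `NamespaceRaceSketch` class, its method names, and the single in-process `treeLock` are all hypothetical stand-ins, and a set of paths plays the role of a table with one row per path. The first method replays the check, delete, write sequence by hand. The second shows the same operations when the read, check, write sequence is guarded, which a real fix would have to achieve with row locks, versioned CAS, or some other cross-row mechanism.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model: each path is one "row"; single-row operations are atomic.
class NamespaceRaceSketch {
    private final Set<String> rows = ConcurrentHashMap.newKeySet();
    // Assumption: one in-process lock standing in for whatever tree-level locking is chosen.
    private final Object treeLock = new Object();

    NamespaceRaceSketch() { rows.add("/"); }

    boolean exists(String path) { return rows.contains(path); }

    void put(String path) { rows.add(path); }

    // Deletes the directory and every descendant row that is visible at this moment.
    void deleteRecursive(String dir) {
        rows.removeIf(p -> p.equals(dir) || p.startsWith(dir + "/"));
    }

    // Guarded create: the parent check and the child write happen under one lock.
    boolean createGuarded(String parent, String child) {
        synchronized (treeLock) {
            if (!exists(parent)) return false;
            put(child);
            return true;
        }
    }

    void deleteGuarded(String dir) {
        synchronized (treeLock) { deleteRecursive(dir); }
    }

    // Replays the unguarded interleaving step by step; returns true if a phantom row is left.
    static boolean unsafeInterleavingLeavesOrphan() {
        NamespaceRaceSketch ns = new NamespaceRaceSketch();
        ns.put("/a");
        boolean parentOk = ns.exists("/a"); // client 1: read and check the parent
        ns.deleteRecursive("/a");           // client 2: delete runs between check and write
        if (parentOk) ns.put("/a/f");       // client 1: write based on the stale check
        return ns.exists("/a/f") && !ns.exists("/a");
    }

    // Same operations with guards: the create either sees the parent or is refused.
    static boolean guardedSequenceLeavesOrphan() {
        NamespaceRaceSketch ns = new NamespaceRaceSketch();
        ns.put("/a");
        ns.deleteGuarded("/a");
        ns.createGuarded("/a", "/a/f");
        return ns.exists("/a/f") && !ns.exists("/a");
    }

    public static void main(String[] args) {
        System.out.println("unguarded leaves phantom: " + unsafeInterleavingLeavesOrphan());
        System.out.println("guarded leaves phantom: " + guardedSequenceLeavesOrphan());
    }
}
```

In the unguarded case "/a/f" survives although "/a" is gone, so a listing from the root never reaches it, yet a later create of "/a/f" would find an existing entry. The lock here only illustrates the required atomicity; across several clients and RegionServers it would have to be replaced by something that actually spans rows.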