Open kjsanger opened 4 weeks ago
Thank you for reporting the issue.
First, I updated the code to not throw FNFE error when file does not exist during deletion.
However, I don't understand the flow. It first stat
to check if the file exists or not. As it didn't return error, fell into else
branch, meaning "the file exists".
Then the code tried to remove the file but getting "the file does not exist" error which contradicts to previous response. This maybe related to database transaction
, resource server-side caching
, or go-irodsclient caching
. Can you check if the file actually exists before calling fs.RemoveFile
?
I agree, it looks like it shouldn't happen. However, pausing the tests in the debugger just after the error has been raised shows the file does not actually exist in iRODS (by inspecting with ils
).
The debugger shows Stat
returning a cached value for the file that was previously deleted (cachedEntry := fs.cache.GetEntryCache(irodsPath)
). The removal was done using fs.RemoveDir(workColl, true, true))
and the FileSystem
instance used to delete the collection previously is the same one as is being used to upload the file again.
Yes, https://github.com/cyverse/go-irodsclient/commit/0b74740d1c8667fcd1c1565afb15a223b85692d7 does work around the issue.
Can you share the test code failing? I'd like to take a closer look into it.
Here's a standalone reproducer (for v0.15.6
).
package main
import (
"fmt"
"github.com/cyverse/go-irodsclient/config"
"github.com/cyverse/go-irodsclient/fs"
"github.com/cyverse/go-irodsclient/irods/types"
"os"
)
var (
iRODSEnvFile = "test_irods_environment.json"
iRODSAuthFile = "test_auth_file"
iRODSPassword = "irods"
testZone = "testZone"
rootColl = fmt.Sprintf("/%s/home/irods", testZone)
testColl = fmt.Sprintf("%s/%s", rootColl, "repro")
testFile = "test.txt"
localPath = testFile
remotePath = fmt.Sprintf("%s/%s", testColl, testFile)
)
func main() {
var manager *config.ICommandsEnvironmentManager
var err error
if manager, err = config.NewICommandsEnvironmentManager(); err != nil {
panic(err)
}
if err = manager.SetEnvironmentFilePath(iRODSEnvFile); err != nil {
panic(err)
}
if err = manager.Load(); err != nil {
panic(err)
}
manager.Environment.Password = iRODSPassword
manager.PasswordFilePath = iRODSAuthFile
// If there's no auth file, create one
if _, err = os.Stat(iRODSAuthFile); err != nil {
if os.IsNotExist(err) {
obfuscator := config.NewPasswordObfuscator()
obfuscator.SetUID(manager.UID)
err = obfuscator.EncodeToFile(iRODSAuthFile, []byte(iRODSPassword))
if err != nil {
panic(err)
}
fmt.Fprintf(os.Stderr, "Created iRODS auth file at %s\n", iRODSAuthFile)
} else {
panic(err)
}
}
var account *types.IRODSAccount
if account, err = manager.ToIRODSAccount(); err != nil {
panic(err)
}
var ifs *fs.FileSystem
ifs, err = fs.NewFileSystemWithDefault(account, "repro")
if err != nil {
panic(err)
}
// The core of the test; put a file to iRODS, then remove the collection it is in,
// recursively. Repeat this 10 times.
for i := range 100 {
if err = ifs.MakeDir(testColl, true); err != nil {
fmt.Fprintf(os.Stderr, "Iteration %d, make '%s' failed: %s\n", i, testColl, err)
panic(err)
} else {
fmt.Fprintf(os.Stderr, "Iteration %d, make '%s' succeeded\n", i, testColl)
}
fmt.Fprintf(os.Stderr, "Iteration %d, putting '%s' to '%s'\n", i, localPath, remotePath)
_, err = ifs.UploadFile(localPath, remotePath, "", false, true, true, nil)
if err != nil {
fmt.Fprintf(os.Stderr, "Iteration %d, put '%s' to '%s' failed: %s\n", i,
localPath, remotePath, err)
panic(err)
} else {
fmt.Fprintf(os.Stderr, "Iteration %d, put '%s' to '%s' succeeded\n", i,
localPath, remotePath)
}
_, err = ifs.Stat(remotePath)
if err != nil {
fmt.Fprintf(os.Stderr, "Iteration %d, stat '%s' failed: %s\n", i, remotePath, err)
panic(err)
} else {
fmt.Fprintf(os.Stderr, "Iteration %d, stat '%s' succeeded\n", i, remotePath)
}
err = ifs.RemoveDir(testColl, true, true)
if err != nil {
fmt.Fprintf(os.Stderr, "Iteration %d, remove '%s' failed: %s\n", i, testColl, err)
panic(err)
} else {
fmt.Fprintf(os.Stderr, "Iteration %d, remove '%s' succeeded\n", i, testColl)
}
}
}
Output on iRODS 4.2.11:
Iteration 0, make '/testZone/home/irods/repro' succeeded
Iteration 0, putting 'test.txt' to '/testZone/home/irods/repro/test.txt'
Iteration 0, put 'test.txt' to '/testZone/home/irods/repro/test.txt' succeeded
Iteration 0, stat '/testZone/home/irods/repro/test.txt' succeeded
Iteration 0, remove '/testZone/home/irods/repro' succeeded
Iteration 1, make '/testZone/home/irods/repro' succeeded
Iteration 1, putting 'test.txt' to '/testZone/home/irods/repro/test.txt'
Iteration 1, put 'test.txt' to '/testZone/home/irods/repro/test.txt' failed: failed to remove data object "/testZone/home/irods/repro/test.txt" for overwrite: failed to find the data object for path "/testZone/home/irods/repro/test.txt": data object/collection not found for path "/testZone/home/irods/repro/test.txt"
panic: failed to remove data object "/testZone/home/irods/repro/test.txt" for overwrite: failed to find the data object for path "/testZone/home/irods/repro/test.txt": data object/collection not found for path "/testZone/home/irods/repro/test.txt"
goroutine 1 [running]:
main.main()
/Users/kdj/Development/go/projects/sqyrrl/repro.go:83 +0x7c8
Process finished with the exit code 2
In the server log I see:
Nov 18 15:19:14 pid:440 NOTICE: [rsDataObjUnlink_impl:386] - [CAT_NO_ROWS_FOUND: [resolve_resource_hierarchy:390] - failed in file_object_factory:[[-] /repos/irods/server/core/src/irods_file_object.cpp:359:irods::error irods::file_object_factory(rsComm_t *, dataObjInp_t *, irods::file_object_ptr, dataObjInfo_t **) : status [CAT_NO_ROWS_FOUND] errno [] -- message [failed in call to getDataObjInfoIncSpecColl for [/testZone/home/irods/repro/test.txt] CAT_NO_ROWS_FOUND ]
]
]
Nov 18 15:19:14 pid:440 remote addresses: 172.20.0.1, ::1 ERROR: [-] /repos/irods/server/api/src/rsDataObjUnlink.cpp:415:int (anonymous namespace)::rsDataObjUnlink_impl(rsComm_t *, dataObjInp_t *) : status [CAT_NO_ROWS_FOUND] errno [] -- message [rsDataObjUnlink_impl - Failed to get data objects. - CAT_NO_ROWS_FOUND ]
Nov 18 15:19:14 pid:440 remote addresses: 172.20.0.1, ::1 ERROR: [-] /repos/irods/server/core/src/rsApiHandler.cpp:542:int readAndProcClientMsg(rsComm_t *, int) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header']
[-] /repos/irods/lib/core/src/sockComm.cpp:198:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header']
[-] /repos/irods/plugins/network/tcp/libtcp.cpp:194:irods::error tcp_read_msg_header(irods::plugin_context &, void *, struct timeval *) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [only read [0] of [4]]
Nov 18 15:19:14 pid:444 remote addresses: 172.20.0.1, ::1 ERROR: [-] /repos/irods/server/core/src/rsApiHandler.cpp:542:int readAndProcClientMsg(rsComm_t *, int) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header']
[-] /repos/irods/lib/core/src/sockComm.cpp:198:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header']
[-] /repos/irods/plugins/network/tcp/libtcp.cpp:194:irods::error tcp_read_msg_header(irods::plugin_context &, void *, struct timeval *) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [only read [0] of [4]]
Nov 18 15:19:14 pid:442 remote addresses: 172.20.0.1, ::1 ERROR: [-] /repos/irods/server/core/src/rsApiHandler.cpp:542:int readAndProcClientMsg(rsComm_t *, int) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header']
[-] /repos/irods/lib/core/src/sockComm.cpp:198:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header']
[-] /repos/irods/plugins/network/tcp/libtcp.cpp:194:irods::error tcp_read_msg_header(irods::plugin_context &, void *, struct timeval *) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [only read [0] of [4]]
Nov 18 15:19:14 pid:444 remote addresses: 172.20.0.1 ERROR: Agent [444] exiting with status = -4000
Nov 18 15:19:14 pid:440 remote addresses: 172.20.0.1 ERROR: Agent [440] exiting with status = -4000
Nov 18 15:19:14 pid:442 remote addresses: 172.20.0.1 ERROR: Agent [442] exiting with status = -4000
Nov 18 15:19:14 pid:64 ERROR: Agent process [440] exited with status [96]
Nov 18 15:19:14 pid:64 ERROR: Agent process [442] exited with status [96]
Nov 18 15:19:14 pid:64 ERROR: Agent process [444] exited with status [96]
Thank you! I'll take a look.
v0.15.6, iRODS 4.2.7, iRODS 4.3.3
Hi,
While using
UploadFile
to stage a file for unit tests, I've noticed what may be a bug. I see errors like this:Looking at the logic within
UploadFile
I can see that it accepts thatStat
on the remote path may yield a FNFE and tests for that before proceeding:https://github.com/cyverse/go-irodsclient/blob/b40b30f65857b00c12ec664218ce6d33c27d8efc/fs/fs_bulk.go#L547-L562
so in the
else
branch the remote path may, or may not, have a file already there. If the remote path is a collection, it handles that, otherwise it callsRemoveFile
on the remote path.RemoveFile
callsDeleteDataObject
with theforce
flag, so all should be OK. However, the "failed to find the data object for path ..." message is a new FNFE raised byDeleteDataObject
, despite theforce
flag, which doesn't seem right.I wonder if
DeleteDataObject
should be raising FNFE ifforce
is true because I'd expect that to be whatforce
is allowing here?