cyverse / go-irodsclient

Go iRODS Client
Other
11 stars 6 forks source link

v0.15.6: UploadFile correctly ignores FileNotFoundError when overwriting, but delegates to RemoveFile which raises FileNotFoundError #50

Open kjsanger opened 4 weeks ago

kjsanger commented 4 weeks ago

v0.15.6, iRODS 4.2.7, iRODS 4.3.3

Hi,

While using UploadFile to stage a file for unit tests, I've noticed what may be a bug. I see errors like this:

 [FAILED] Unexpected error:
      <*xerrors.wrapError | 0x14000477c40>: 
      failed to remove data object "/testZone/home/irods/iRODSGetHandler.5552.2447319374/test.txt" for overwrite: failed to find the data object for path "/testZone/home/irods/iRODSGetHandler.5552.2447319374/test.txt": data object/collection not found for path "/testZone/home/irods/iRODSGetHandler.5552.2447319374/test.txt"
      {
          msg: "failed to remove data object \"/testZone/home/irods/iRODSGetHandler.5552.2447319374/test.txt\" for overwrite",
          err: <*xerrors.wrapError | 0x14000477540>{
              msg: "failed to find the data object for path \"/testZone/home/irods/iRODSGetHandler.5552.2447319374/test.txt\"",
              err: <*types.FileNotFoundError | 0x14000488ae0>{
                  Path: "/testZone/home/irods/iRODSGetHandler.5552.2447319374/test.txt",
              },
              frame: {
                  frames: [0x104ad23f4, 0x104b6b5ac, 0x104ba38e8],
              },
          },
          frame: {
              frames: [0x104ad23f4, 0x104bad410, 0x104d0703c],
          },
      }
  occurred
  In [BeforeEach] at: /Users/kdj/Development/go/projects/sqyrrl/server/handlers_test.go:117 @ 10/31/24 09:39:29.069

Looking at the logic within UploadFile I can see that it accepts that Stat on the remote path may yield a FNFE and tests for that before proceeding:

https://github.com/cyverse/go-irodsclient/blob/b40b30f65857b00c12ec664218ce6d33c27d8efc/fs/fs_bulk.go#L547-L562

so in the else branch the remote path may, or may not, have a file already there. If the remote path is a collection, it handles that, otherwise it calls RemoveFile on the remote path. RemoveFile calls DeleteDataObject with the force flag, so all should be OK. However, the "failed to find the data object for path ..." message is a new FNFE raised by DeleteDataObject, despite the force flag, which doesn't seem right.

I wonder if DeleteDataObject should be raising FNFE if force is true because I'd expect that to be what force is allowing here?

iychoi commented 3 weeks ago

Thank you for reporting the issue.

First, I updated the code to not throw FNFE error when file does not exist during deletion.

However, I don't understand the flow. It first stat to check if the file exists or not. As it didn't return error, fell into else branch, meaning "the file exists". Then the code tried to remove the file but getting "the file does not exist" error which contradicts to previous response. This maybe related to database transaction, resource server-side caching, or go-irodsclient caching. Can you check if the file actually exists before calling fs.RemoveFile?

kjsanger commented 3 weeks ago

I agree, it looks like it shouldn't happen. However, pausing the tests in the debugger just after the error has been raised shows the file does not actually exist in iRODS (by inspecting with ils). The debugger shows Stat returning a cached value for the file that was previously deleted (cachedEntry := fs.cache.GetEntryCache(irodsPath)). The removal was done using fs.RemoveDir(workColl, true, true)) and the FileSystem instance used to delete the collection previously is the same one as is being used to upload the file again.

kjsanger commented 3 weeks ago

Yes, https://github.com/cyverse/go-irodsclient/commit/0b74740d1c8667fcd1c1565afb15a223b85692d7 does work around the issue.

iychoi commented 3 weeks ago

Can you share the test code failing? I'd like to take a closer look into it.

kjsanger commented 1 week ago

Here's a standalone reproducer (for v0.15.6).

package main

import (
    "fmt"
    "github.com/cyverse/go-irodsclient/config"
    "github.com/cyverse/go-irodsclient/fs"
    "github.com/cyverse/go-irodsclient/irods/types"
    "os"
)

var (
    iRODSEnvFile  = "test_irods_environment.json"
    iRODSAuthFile = "test_auth_file"
    iRODSPassword = "irods"

    testZone = "testZone"
    rootColl = fmt.Sprintf("/%s/home/irods", testZone)
    testColl = fmt.Sprintf("%s/%s", rootColl, "repro")

    testFile   = "test.txt"
    localPath  = testFile
    remotePath = fmt.Sprintf("%s/%s", testColl, testFile)
)

func main() {
    var manager *config.ICommandsEnvironmentManager
    var err error

    if manager, err = config.NewICommandsEnvironmentManager(); err != nil {
        panic(err)
    }
    if err = manager.SetEnvironmentFilePath(iRODSEnvFile); err != nil {
        panic(err)
    }
    if err = manager.Load(); err != nil {
        panic(err)
    }

    manager.Environment.Password = iRODSPassword
    manager.PasswordFilePath = iRODSAuthFile

    // If there's no auth file, create one
    if _, err = os.Stat(iRODSAuthFile); err != nil {
        if os.IsNotExist(err) {
            obfuscator := config.NewPasswordObfuscator()
            obfuscator.SetUID(manager.UID)
            err = obfuscator.EncodeToFile(iRODSAuthFile, []byte(iRODSPassword))
            if err != nil {
                panic(err)
            }
            fmt.Fprintf(os.Stderr, "Created iRODS auth file at %s\n", iRODSAuthFile)
        } else {
            panic(err)
        }
    }

    var account *types.IRODSAccount
    if account, err = manager.ToIRODSAccount(); err != nil {
        panic(err)
    }

    var ifs *fs.FileSystem
    ifs, err = fs.NewFileSystemWithDefault(account, "repro")
    if err != nil {
        panic(err)
    }

    // The core of the test; put a file to iRODS, then remove the collection it is in,
    // recursively. Repeat this 10 times.
    for i := range 100 {
        if err = ifs.MakeDir(testColl, true); err != nil {
            fmt.Fprintf(os.Stderr, "Iteration %d, make '%s' failed: %s\n", i, testColl, err)
            panic(err)
        } else {
            fmt.Fprintf(os.Stderr, "Iteration %d, make '%s' succeeded\n", i, testColl)
        }

        fmt.Fprintf(os.Stderr, "Iteration %d, putting '%s' to '%s'\n", i, localPath, remotePath)
        _, err = ifs.UploadFile(localPath, remotePath, "", false, true, true, nil)
        if err != nil {
            fmt.Fprintf(os.Stderr, "Iteration %d, put '%s' to '%s' failed: %s\n", i,
                localPath, remotePath, err)
            panic(err)
        } else {
            fmt.Fprintf(os.Stderr, "Iteration %d, put '%s' to '%s' succeeded\n", i,
                localPath, remotePath)
        }

        _, err = ifs.Stat(remotePath)
        if err != nil {
            fmt.Fprintf(os.Stderr, "Iteration %d, stat '%s' failed: %s\n", i, remotePath, err)
            panic(err)
        } else {
            fmt.Fprintf(os.Stderr, "Iteration %d, stat '%s' succeeded\n", i, remotePath)
        }

        err = ifs.RemoveDir(testColl, true, true)
        if err != nil {
            fmt.Fprintf(os.Stderr, "Iteration %d, remove '%s' failed: %s\n", i, testColl, err)
            panic(err)
        } else {
            fmt.Fprintf(os.Stderr, "Iteration %d, remove '%s' succeeded\n", i, testColl)
        }
    }
}

Output on iRODS 4.2.11:

Iteration 0, make '/testZone/home/irods/repro' succeeded
Iteration 0, putting 'test.txt' to '/testZone/home/irods/repro/test.txt'
Iteration 0, put 'test.txt' to '/testZone/home/irods/repro/test.txt' succeeded
Iteration 0, stat '/testZone/home/irods/repro/test.txt' succeeded
Iteration 0, remove '/testZone/home/irods/repro' succeeded
Iteration 1, make '/testZone/home/irods/repro' succeeded
Iteration 1, putting 'test.txt' to '/testZone/home/irods/repro/test.txt'
Iteration 1, put 'test.txt' to '/testZone/home/irods/repro/test.txt' failed: failed to remove data object "/testZone/home/irods/repro/test.txt" for overwrite: failed to find the data object for path "/testZone/home/irods/repro/test.txt": data object/collection not found for path "/testZone/home/irods/repro/test.txt"
panic: failed to remove data object "/testZone/home/irods/repro/test.txt" for overwrite: failed to find the data object for path "/testZone/home/irods/repro/test.txt": data object/collection not found for path "/testZone/home/irods/repro/test.txt"

goroutine 1 [running]:
main.main()
    /Users/kdj/Development/go/projects/sqyrrl/repro.go:83 +0x7c8

Process finished with the exit code 2

In the server log I see:

Nov 18 15:19:14 pid:440 NOTICE: [rsDataObjUnlink_impl:386] - [CAT_NO_ROWS_FOUND: [resolve_resource_hierarchy:390] - failed in file_object_factory:[[-]  /repos/irods/server/core/src/irods_file_object.cpp:359:irods::error irods::file_object_factory(rsComm_t *, dataObjInp_t *, irods::file_object_ptr, dataObjInfo_t **) :  status [CAT_NO_ROWS_FOUND]  errno [] -- message [failed in call to getDataObjInfoIncSpecColl for [/testZone/home/irods/repro/test.txt] CAT_NO_ROWS_FOUND ]

]

]
Nov 18 15:19:14 pid:440 remote addresses: 172.20.0.1, ::1 ERROR: [-]    /repos/irods/server/api/src/rsDataObjUnlink.cpp:415:int (anonymous namespace)::rsDataObjUnlink_impl(rsComm_t *, dataObjInp_t *) :  status [CAT_NO_ROWS_FOUND]  errno [] -- message [rsDataObjUnlink_impl - Failed to get data objects. - CAT_NO_ROWS_FOUND ]

Nov 18 15:19:14 pid:440 remote addresses: 172.20.0.1, ::1 ERROR: [-]    /repos/irods/server/core/src/rsApiHandler.cpp:542:int readAndProcClientMsg(rsComm_t *, int) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']
        [-]     /repos/irods/lib/core/src/sockComm.cpp:198:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']
                [-]     /repos/irods/plugins/network/tcp/libtcp.cpp:194:irods::error tcp_read_msg_header(irods::plugin_context &, void *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [only read [0] of [4]]

Nov 18 15:19:14 pid:444 remote addresses: 172.20.0.1, ::1 ERROR: [-]    /repos/irods/server/core/src/rsApiHandler.cpp:542:int readAndProcClientMsg(rsComm_t *, int) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']
        [-]     /repos/irods/lib/core/src/sockComm.cpp:198:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']
                [-]     /repos/irods/plugins/network/tcp/libtcp.cpp:194:irods::error tcp_read_msg_header(irods::plugin_context &, void *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [only read [0] of [4]]

Nov 18 15:19:14 pid:442 remote addresses: 172.20.0.1, ::1 ERROR: [-]    /repos/irods/server/core/src/rsApiHandler.cpp:542:int readAndProcClientMsg(rsComm_t *, int) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']
        [-]     /repos/irods/lib/core/src/sockComm.cpp:198:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [failed to call 'read header']
                [-]     /repos/irods/plugins/network/tcp/libtcp.cpp:194:irods::error tcp_read_msg_header(irods::plugin_context &, void *, struct timeval *) :  status [SYS_HEADER_READ_LEN_ERR]  errno [] -- message [only read [0] of [4]]

Nov 18 15:19:14 pid:444 remote addresses: 172.20.0.1 ERROR: Agent [444] exiting with status = -4000
Nov 18 15:19:14 pid:440 remote addresses: 172.20.0.1 ERROR: Agent [440] exiting with status = -4000
Nov 18 15:19:14 pid:442 remote addresses: 172.20.0.1 ERROR: Agent [442] exiting with status = -4000
Nov 18 15:19:14 pid:64  ERROR: Agent process [440] exited with status [96]
Nov 18 15:19:14 pid:64  ERROR: Agent process [442] exited with status [96]
Nov 18 15:19:14 pid:64  ERROR: Agent process [444] exited with status [96]
iychoi commented 1 week ago

Thank you! I'll take a look.