irods / python-irodsclient

A Python API for iRODS

Error while sending file multiple time #514

Closed sigau closed 8 months ago

sigau commented 8 months ago

Hello, I'm trying to compare the efficiency of the API with the icommands for sending files.

To do this, we have a script that uses the API, such as:

import time
from irods.session import iRODSSession

# irods_config (connection settings) and nb_threads are defined elsewhere in the script.
def PUSH_SPEEDTEST(local_object, irods_path):
    # Time one upload through the API, for comparison with the icommands.
    with iRODSSession(**irods_config) as session:
        temps_debut = time.time()
        session.data_objects.put(local_object, irods_path, num_threads=nb_threads)
        temps_fin = time.time()
        temps_total = temps_fin - temps_debut
    print(temps_total)

We then ran this function a hundred times on different files of different sizes (which we also sent to iRODS using the icommands). It works, except that we randomly get this error, which we can't understand or fix:

Traceback (most recent call last):
  File "/home/gdebaeck/Documents/easy_irods_commands/api_easicmd.py", line 1120, in <module>
    main()
  File "/home/gdebaeck/Documents/easy_irods_commands/api_easicmd.py", line 213, in main
    PUSH_SPEEDTEST(sys.argv[2],sys.argv[3])
  File "/home/gdebaeck/Documents/easy_irods_commands/api_easicmd.py", line 759, in PUSH_SPEEDTEST
    session.data_objects.put(local_object,irods_path,num_threads=nb_threads)
  File "/home/gdebaeck/Documents/easy_irods_commands/env_easicmd/lib/python3.10/site-packages/irods/manager/data_object_manager.py", line 151, in put
    if not self.parallel_put( local_path, (obj,o), total_bytes = sizelist[0], num_threads = num_threads,
  File "/home/gdebaeck/Documents/easy_irods_commands/env_easicmd/lib/python3.10/site-packages/irods/manager/data_object_manager.py", line 244, in parallel_put
    return parallel.io_main( self.sess, data_or_path_, parallel.Oper.PUT | (parallel.Oper.NONBLOCKING if async_ else 0), file_,
  File "/home/gdebaeck/Documents/easy_irods_commands/env_easicmd/lib/python3.10/site-packages/irods/parallel.py", line 482, in io_main
    retval = _io_multipart_threaded (Operation, (Data, Io), replica_token, resc_hier, session, fname, total_bytes,
  File "/home/gdebaeck/Documents/easy_irods_commands/env_easicmd/lib/python3.10/site-packages/irods/parallel.py", line 382, in _io_multipart_threaded
    Io = session.data_objects.open( Data_object.path, Operation.data_object_mode(initial_open = False),
  File "/home/gdebaeck/Documents/easy_irods_commands/env_easicmd/lib/python3.10/site-packages/irods/manager/data_object_manager.py", line 340, in open
    desc = conn.recv().int_info
  File "/home/gdebaeck/Documents/easy_irods_commands/env_easicmd/lib/python3.10/site-packages/irods/connection.py", line 133, in recv
    raise get_exception_by_code(msg.int_info, err_msg)
  File "/home/gdebaeck/Documents/easy_irods_commands/env_easicmd/lib/python3.10/site-packages/irods/exception.py", line 171, in get_exception_by_code
    exc_class = iRODSExceptionMeta.codes[ rounded_code( code ) ]
KeyError: -40500

Can you help us understand where this error comes from and how to fix it? Thank you in advance

Gautier

alanking commented 8 months ago

Hi Gautier,

The error code -40500 is INTERMEDIATE_REPLICA_ACCESS. This means that a replica exists and is opened for write by an iRODS agent process, and another iRODS agent process has tried and failed to open the same replica.
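
The traceback ends in a bare `KeyError` because the client's lookup of the code -40500 failed, so no named exception was raised. As a minimal sketch (the wrapper `call_with_named_error` and the one-entry table `KNOWN_CODES` are hypothetical, not part of python-irodsclient, and the table contains only the code named above), a script could turn that `KeyError` into a readable message like this:

```python
# Hypothetical helper, not part of python-irodsclient.
# The only mapping used here is the one stated in this thread.
KNOWN_CODES = {-40500: "INTERMEDIATE_REPLICA_ACCESS"}


def call_with_named_error(operation, *args, **kwargs):
    """Run operation(*args, **kwargs) and translate KeyError(<known code>)
    into a RuntimeError that names the iRODS error."""
    try:
        return operation(*args, **kwargs)
    except KeyError as exc:
        code = exc.args[0] if exc.args else None
        if isinstance(code, int) and code in KNOWN_CODES:
            raise RuntimeError(
                f"iRODS error {code} ({KNOWN_CODES[code]})"
            ) from exc
        raise  # any other KeyError is passed through unchanged
```

In the script from the original post, this might be used as `call_with_named_error(session.data_objects.put, local_object, irods_path, num_threads=nb_threads)`.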

Are you able to determine which data object is returning this error when running the put? If so, please confirm the status of the replicas of that data object. If the object is locked, do you know whether it is stuck that way, or is another iRODS agent still writing to it?
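
One way to check replica status from Python is sketched below, under two assumptions. First, that the replica status codes are the ones documented for iRODS 4.2.9 and later (0 stale, 1 good, 2 intermediate, 3 read-locked, 4 write-locked). Second, that the objects passed in expose `number` and `status` attributes, as the entries of `session.data_objects.get(path).replicas` are expected to in python-irodsclient. The helper names are hypothetical, and the example runs on stand-in objects rather than a live server.

```python
from types import SimpleNamespace

# Status codes as documented for iRODS 4.2.9 and later (assumption: the
# server in question is at least that version).
REPLICA_STATUS = {
    "0": "stale",
    "1": "good",
    "2": "intermediate",
    "3": "read-locked",
    "4": "write-locked",
}


def describe_replicas(replicas):
    """Return (replica number, status name) for each replica-like object."""
    return [
        (r.number, REPLICA_STATUS.get(str(r.status), f"unknown ({r.status})"))
        for r in replicas
    ]


def locked_or_intermediate(replicas):
    """Return the replicas whose status is intermediate or locked (2, 3 or 4)."""
    return [r for r in replicas if str(r.status) in {"2", "3", "4"}]


# Stand-in replicas for illustration; with a live session these would come
# from the data object's replica list instead.
fake_replicas = [
    SimpleNamespace(number=0, status="1"),
    SimpleNamespace(number=1, status="2"),
]
print(describe_replicas(fake_replicas))
print([r.number for r in locked_or_intermediate(fake_replicas)])
```

If a replica is reported as intermediate or locked while no transfer is running, that points to the "stuck" case rather than a concurrent write.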

If an iRODS agent is still writing to the object at the time this happens, things are working as designed. Uncoordinated, concurrent writes to the same data object are not allowed. Make sure you are not attempting to put to the same logical path at the same time from different clients.

If no iRODS agent is writing to the data object, the object is considered "stuck" and this is a bug. Please file an issue on GitHub with steps to reproduce the issue and see the following section in the docs for instructions on how to remedy the situation: https://docs.irods.org/4.3.1/system_overview/troubleshooting/#data-object-stuck-in-locked-or-intermediate-status

-- Alan King Senior Software Developer | iRODS Consortium

d-w-moore commented 8 months ago

@sigau Is it possible this script is being run from multiple client sessions simultaneously? That is the only way I could see one agent's access getting in the way of another.

sigau commented 8 months ago

I was thinking of something like that, because we run the script 100 times in a while loop. But since it's a while loop, it should only begin the next iteration when the previous one has finished. I think some objects were blocked, and when we re-ran the loop it failed on some of them. This no longer happens now that we make sure each object has a unique name. Thanks for your help.
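
The unique-name workaround could look something like the sketch below. The helper `unique_logical_path` and the collection `/tempZone/home/alice/speedtest` are hypothetical names chosen for illustration, and the combination of a run index with a random suffix is just one possible scheme.

```python
import os
import posixpath
import uuid


def unique_logical_path(collection, local_path, run_index):
    """Build a logical path that differs on every call, so repeated runs
    never target a data object that a previous run may have left locked."""
    stem, ext = os.path.splitext(os.path.basename(local_path))
    suffix = uuid.uuid4().hex[:8]  # random part; avoids clashes across re-runs
    name = f"{stem}_run{run_index:03d}_{suffix}{ext}"
    # iRODS logical paths use forward slashes regardless of the client OS.
    return posixpath.join(collection, name)


# Hypothetical collection and file name, for illustration only.
for i in range(3):
    print(unique_logical_path("/tempZone/home/alice/speedtest", "sample.bin", i))
```

Each generated path would then be passed as `irods_path` to `PUSH_SPEEDTEST`, so no two iterations write to the same data object.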

korydraughn commented 8 months ago

@sigau Seems you're in a good spot now. Shall we close this?

sigau commented 8 months ago

yes sorry

korydraughn commented 8 months ago

All good. Thanks.