irods / irods

Open Source Data Management Software
https://irods.org
BSD 3-Clause "New" or "Revised" License

Data objects with same logical path but different Ids #6906

Open tedgin opened 1 year ago

tedgin commented 1 year ago

Bug Report

iRODS Version, OS and Version

iRODS 4.2.8, CentOS 7

What did you try to do?

I'm going to defer to @iychoi to explain how this happened, since he understands the steps better than I do. In brief, I think he did the following in rapid succession.

  1. Upload a file.
  2. Delete the newly created data object.
  3. Upload a file to the same location.

Our rule logic asynchronously replicates newly created data objects. I suspect that occasionally the replication of the data object created in step 1 would be in progress when he deleted the data object in step 2.
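
For anyone trying to reproduce this, the three steps above can be sketched as a small helper. This is only a sketch under assumptions: the report does not say which client was used, so python-irodsclient's `data_objects.put` and `data_objects.unlink` are assumed here, and the function name `upload_delete_upload` and the `force=True` (skip the trash) choice are hypothetical.

```python
def upload_delete_upload(session, local_path, logical_path, force=True):
    """Run the three steps back to back, with no pause between them.

    `session` is assumed to behave like python-irodsclient's iRODSSession.
    `force=True` (bypass the trash) is an assumption; the report does not
    say how the delete was performed.
    """
    # Step 1: upload a file, which creates a new data object.
    session.data_objects.put(local_path, logical_path)
    # Step 2: delete the newly created data object right away.
    session.data_objects.unlink(logical_path, force=force)
    # Step 3: upload a file to the same logical path again.
    session.data_objects.put(local_path, logical_path)
```

Because the replication rule runs asynchronously, a single run may not hit the race. Calling the helper in a loop over many files would presumably make the duplicate more likely to appear.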

Expected behavior

I expect that deleting a data object while it is being replicated would interrupt the replication and remove the data object entirely from the catalog.

Observed behavior (including steps to reproduce, if applicable)

We occasionally end up with two data objects in the catalog with the same logical path but different Ids. One data object has a replica in the replication resource, and the other has a replica in the ingest resource.

Here's an example.

ipc_admin@prod ~? isysmeta ls -l /iplant/home/user/DJI_0128.JPG
doing ls of /iplant/home/user/DJI_0128.JPG
data_name: DJI_0128.JPG
data_id: 915494550
coll_id: 915492016
data_repl_num: 0
data_version: 
data_type_name: generic
data_size: 8592622
resc_name: corral4
data_path: /corral/irods/iplant/Vault/home/user/DJI_0128.JPG
data_owner_name: user
data_owner_zone: iplant
data_repl_status: 1
data_status: 
data_checksum: 85482426389bca40f4dd29c219c47822
data_expiry_ts (expire time): 00000000000: None
data_map_id: 0
r_comment: 
create_ts: 01674865784: 2023-01-27.17:29:44
modify_ts: 01674865784: 2023-01-27.17:29:44
----
data_name: DJI_0128.JPG
data_id: 915492457
coll_id: 915492016
data_repl_num: 1
data_version: 
data_type_name: generic
data_size: 8592622
resc_name: hugo
data_path: /irods_vault/home/user/DJI_0128.JPG
data_owner_name: user
data_owner_zone: iplant
data_repl_status: 1
data_status: 
data_checksum: 85482426389bca40f4dd29c219c47822
data_expiry_ts (expire time): 00000000000: None
data_map_id: 0
r_comment: 
create_ts: 01674865077: 2023-01-27.17:17:57
modify_ts: 01674865077: 2023-01-27.17:17:57

trel commented 1 year ago

I think this might be completely solved/prevented by the logical locking introduced in 4.2.9.

iychoi commented 1 year ago

We will test this again when we upgrade the Data Store to 4.2.11.

It looks like the issue is related to database transactions. The issue disappeared when I closed the database transactions after file deletion and creation.

trel commented 1 year ago

Ah, very good - a workaround until upgrade.