galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Dataset not visible in a user's history (compressed/uncompressed, same hid) #17220

Closed mira-miracoli closed 2 months ago

mira-miracoli commented 10 months ago

Describe the bug
We got a support request from a user who ran out of storage and cannot delete two large datasets because they are not shown in the history, regardless of the filter combination. The datasets are listed in the storage manager, and you can also find them under User→Datasets, but clicking "view in history" there does not bring them up (filters: deleted:any visible:any hid:61). What does appear is the compressed version, which has the same hid. I looked the datasets up in our database using:

gxadmin query q "select dataset_id, d.uuid, name, hid, visible, d.deleted, d.purged, d.file_size  \
from history_dataset_association inner join dataset d on dataset_id = d.id where history_id = xxxxx and d.purged = 'f'"

The datasets showed up as follows:

dataset_id |                                name                                | hid | visible | deleted | purged |  file_size        
------------+--------------------------------------------------------------------+-----+---------+---------+--------+--------------
...
  xxxxxxx55 | Concatenate datasets on data 53, data 57, and data 55 uncompressed |  61 | f       | f       | f      | 166396579020
  xxxxxxx58 | Concatenate datasets on data 63, data 65, and data 64 uncompressed |  66 | f       | f       | f      | 166396579020
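
The two rows above are hidden but neither deleted nor purged, which is why they keep occupying storage without showing up in the history panel. As a minimal sketch of how such rows could be singled out once the query output is loaded into dictionaries: the function name `hidden_but_not_deleted` and the dictionary layout are assumptions for illustration, and the values are copied from the (censored) output above.

```python
# Rows copied from the query output above; dataset ids are censored in the report.
rows = [
    {"dataset_id": "xxxxxxx55", "hid": 61, "visible": False, "deleted": False,
     "purged": False, "file_size": 166396579020},
    {"dataset_id": "xxxxxxx58", "hid": 66, "visible": False, "deleted": False,
     "purged": False, "file_size": 166396579020},
]


def hidden_but_not_deleted(rows):
    """Return rows that are hidden yet neither deleted nor purged."""
    return [
        row for row in rows
        if not row["visible"] and not row["deleted"] and not row["purged"]
    ]


stuck = hidden_but_not_deleted(rows)
stuck_bytes = sum(row["file_size"] for row in stuck)
print(f"{len(stuck)} hidden datasets holding {stuck_bytes / 1e9:.1f} GB")
```

Summing `file_size` over the matches gives a quick estimate of how much space the invisible datasets are holding.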

Galaxy Version and/or server at which you observed the bug
Galaxy Version: 23.1_europe
Commit: 5af8ba0f4c2df9f1fdac11acc3f282023322690c

To Reproduce
Steps to reproduce the behavior: not sure how. The user reported that two jobs were running; only one finished, and the other was paused because the user was at ~200% of their disk quota. Galaxy shows the following message in the paused job:

Execution of this dataset's job is paused because you were over your disk quota at the time it was ready to run

Expected behavior
The dataset is shown in the history and can be deleted and purged; both the compressed and the uncompressed versions are visible, and the uncompressed version can be deleted.

Screenshots

Screenshot from 2023-12-20 09-06-26-censored Screenshot from 2023-12-20 09-05-27-censored Screenshot from 2023-12-20 16-18-21

Screenshot from 2023-12-20 16-18-39

mvdbeek commented 6 months ago

@ahmedhamidawan I think you fixed this in https://github.com/galaxyproject/galaxy/pull/17648/commits/84b627260f7d6cff19262543be8723921abf2b4a

ahmedhamidawan commented 6 months ago

> @ahmedhamidawan I think you fixed this in https://github.com/galaxyproject/galaxy/pull/17648/commits/84b627260f7d6cff19262543be8723921abf2b4a

Yes @mvdbeek, thank you!

mvdbeek commented 3 months ago

Hmm, guess that wasn't all of it, here's a history with duplicate hids from an implicit conversion: https://usegalaxy.org/u/marius/h/copy-of-human22chrsnps

Compare UI:

Screenshot 2024-08-02 at 19 22 13

with API https://usegalaxy.org/api/histories/c845ae1a2747ea06/contents
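
One way to spot the mismatch is to scan the contents response for hids that occur more than once. The sketch below works on an already-parsed list of items and assumes each item carries `hid` and `name` fields; the function name `duplicate_hids` and the sample items are hypothetical, not the actual response for the history linked above.

```python
from collections import defaultdict


def duplicate_hids(items):
    """Map each hid that occurs more than once to the names sharing it."""
    by_hid = defaultdict(list)
    for item in items:
        by_hid[item["hid"]].append(item["name"])
    return {hid: names for hid, names in by_hid.items() if len(names) > 1}


# Hypothetical contents; the real response for the linked history is not reproduced here.
example_items = [
    {"hid": 1, "name": "snps.vcf"},
    {"hid": 1, "name": "snps.vcf (converted)"},
    {"hid": 2, "name": "report.txt"},
]
print(duplicate_hids(example_items))
```

Any hid reported here has several datasets behind it, so a UI that shows one entry per hid would be hiding the others.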

ahmedhamidawan commented 3 months ago

> Hmm, guess that wasn't all of it, here's a history with duplicate hids from an implicit conversion: https://usegalaxy.org/u/marius/h/copy-of-human22chrsnps

For that history, for example, I was able to come up with a solution where we add an item.sub_items property to items in the historyItemsStore, which allows us to do something like this:

https://github.com/user-attachments/assets/7c3f9d8f-8e4c-4f7c-ac09-851bdc9ff388

I will open a PR tomorrow for this
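
A rough illustration of the sub_items idea, written in Python rather than taken from the client code: items that share a hid are folded under a single entry. The function name `nest_by_hid`, the item fields, and the rule that the first item seen for a hid becomes the top-level entry are all assumptions made for this sketch, not the store's actual implementation.

```python
def nest_by_hid(items):
    """Keep the first item seen for each hid; attach later ones as sub_items."""
    primary_by_hid = {}
    for item in items:
        hid = item["hid"]
        if hid not in primary_by_hid:
            # Copy so the caller's dictionaries are left untouched.
            primary_by_hid[hid] = {**item, "sub_items": []}
        else:
            primary_by_hid[hid]["sub_items"].append(item)
    return list(primary_by_hid.values())


# Hypothetical items, in the order a contents request might return them.
example_items = [
    {"hid": 61, "name": "result (compressed)"},
    {"hid": 61, "name": "result (uncompressed)"},
    {"hid": 62, "name": "other dataset"},
]
nested = nest_by_hid(example_items)
for entry in nested:
    print(entry["hid"], entry["name"], [sub["name"] for sub in entry["sub_items"]])
```

Which item ends up as the top-level entry depends entirely on the order of the input, so the ordering of the response decides what is shown by default.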

mvdbeek commented 3 months ago

This looks cool, but the original dataset is the one you'd want to see by default

martenson commented 3 months ago

How does it show in the toolform dataset input?

ahmedhamidawan commented 3 months ago

> How does it show in the toolform dataset input?

Oh, that would show the "original" dataset, which is set based on what the API returns for the current filter...

mira-miracoli commented 2 months ago

Thank you for fixing this @ahmedhamidawan ❤️