datalad / datalad-gooey

A graphical user interface for DataLad (datalad.org)
https://docs.datalad.org/projects/gooey

EnsureDatasetSiblingName is too expensive #365

Closed bpoldrack closed 1 year ago

bpoldrack commented 1 year ago

If a dataset has many Git remotes, querying datalad siblings is too expensive for widget generation. After a couple of failed attempts to set up gin siblings, I get this:

[WARNING] Could not detect whether gin3 carries an annex. If gin3 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin6 carries an annex. If gin6 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin10 carries an annex. If gin10 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gintest3 carries an annex. If gintest3 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin carries an annex. If gin is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin2 carries an annex. If gin2 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin3 carries an annex. If gin3 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin6 carries an annex. If gin6 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin10 carries an annex. If gin10 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gintest3 carries an annex. If gintest3 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin carries an annex. If gin is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin2 carries an annex. If gin2 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin3 carries an annex. If gin3 is a pure Git remote, this is expected.  
[WARNING] Could not detect whether gin6 carries an annex. If gin6 is a pure Git remote, this is expected.  
...

This takes minutes. Meanwhile the widget can't be built because the query has not returned, which makes the GUI appear to hang. But all we need is the sibling names, not any further information.

bpoldrack commented 1 year ago

Ideally this would be solved in core or next, with a fast query for siblings that does not go through *Repo.
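
For illustration only, a names-only query could look roughly like the sketch below. It assumes that the names of configured Git remotes are all the widget needs and that calling the `git` executable directly is acceptable. The helper names `parse_remote_names` and `list_remote_names` are hypothetical and not part of datalad or datalad-gooey. Anything that is not configured as a Git remote would not show up here.

```python
import subprocess


def parse_remote_names(output):
    """Split `git remote` output into names, one per non-empty line."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_remote_names(repo_path):
    """Ask Git for the configured remote names (hypothetical helper).

    `git remote` only reads the local configuration, so it does not
    contact any remote and does not probe for an annex.
    """
    proc = subprocess.run(
        ['git', '-C', str(repo_path), 'remote'],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_remote_names(proc.stdout)
```

Calling `list_remote_names('/path/to/dataset')` would return a plain list of strings that can be fed into a selection widget.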

mih commented 1 year ago

Same as https://github.com/datalad/datalad-gooey/issues/342

mih commented 1 year ago

FTR: all the implementation runs is

ds.siblings(action='query', return_type='generator')

Maybe it should run with get_annex_info=False instead.
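
A minimal sketch of that suggestion, assuming `ds` is a DataLad `Dataset` and that each sibling result record carries `type` and `name` keys. The helper name `sibling_names` is hypothetical, and `result_renderer='disabled'` is added here only to keep the query quiet. Whether the local `here` entry should be filtered out is left open.

```python
def sibling_names(ds):
    """Collect sibling names without the per-remote annex detection.

    `ds` is expected to behave like a DataLad Dataset (assumption).
    """
    results = ds.siblings(
        action='query',
        get_annex_info=False,        # skip the expensive annex probing
        result_renderer='disabled',  # no rendering needed for a widget
        return_type='generator',
    )
    # Keep only sibling records that actually carry a name.
    return [
        r['name']
        for r in results
        if r.get('type') == 'sibling' and 'name' in r
    ]
```

With a real dataset this would be called as `sibling_names(Dataset('/path/to/dataset'))`, where the path is a placeholder.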