Bioconductor / ExperimentHub

Client to access ExperimentHub resources
https://bioconductor.org/packages/ExperimentHub

Connection attempt while 'localHub=TRUE' leads to lengthy timeout on offline systems #41

Closed votti closed 3 months ago

votti commented 4 months ago

Background: I have to work on a secure offline system for data protection reasons.

Issue: Every time an ExperimentHub instance is instantiated on an offline system without an internet connection, a delay of about 1 minute occurs while the Hub tries to connect to the online resource, even if localHub=TRUE is passed.

Expected behaviour: If the user indicates that the hub is used offline (localHub=TRUE), no attempt to connect to the online resource should be made.

Root cause: I noticed that the current implementations of ExperimentHub (and AnnotationHub) perform a connection test using readBin upon initialisation even if localHub==TRUE:

https://github.com/Bioconductor/ExperimentHub/blob/f758190e13ec31952d2b2b18fdd4261f5652e24d/R/ExperimentHub-class.R#L29-L35

On our system this fails with a timeout of about 1 minute, thus inducing a 1 minute delay every time an ExperimentHub instance is initialized. This is particularly problematic because certain packages (e.g. DMRcate) re-initialize the Hub classes very frequently instead of reusing them.

Workaround: The readBin call used for the connection test respects the global timeout option, so setting options(timeout = 1) shortens the delay to about 1 second.
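A minimal sketch of how the workaround could be scoped so that the shortened timeout does not leak into the rest of the session. The helper name with_timeout is hypothetical and not part of ExperimentHub; it relies only on base R's options() and on.exit().

```r
# Hypothetical helper (not part of ExperimentHub): evaluate `expr` with a
# temporarily reduced 'timeout' option and restore the old value afterwards.
with_timeout <- function(seconds, expr) {
    old <- options(timeout = seconds)   # options() returns the previous value
    on.exit(options(old), add = TRUE)   # restore even if `expr` fails
    expr                                # lazily evaluated here, after the option is set
}

# Intended use on the offline system (requires ExperimentHub to be installed):
# eh <- with_timeout(1, ExperimentHub::ExperimentHub(localHub = TRUE))

# Self-contained check that the option is only changed inside the call:
options(timeout = 60)
inside <- with_timeout(1, getOption("timeout"))
after  <- getOption("timeout")
```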

This has two issues:

Proposed solutions:

1) Refactor the ExperimentHub initialization to not perform the connection attempt if localHub==TRUE.

2) Use a connection test function that respects a very small timeout (milliseconds).
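Solution 1 could look roughly like the sketch below. This is not the actual ExperimentHub source: resolve_local_hub and the probe argument are hypothetical names, with probe standing in for whatever connection test the package runs, and the fallback to local mode when the probe fails is an assumption about the desired behaviour.

```r
# Hypothetical sketch, not the actual ExperimentHub code.
# `probe` is a zero-argument function returning TRUE if the hub is reachable.
resolve_local_hub <- function(localHub, probe) {
    if (isTRUE(localHub)) {
        return(TRUE)    # offline mode requested: never touch the network
    }
    if (!isTRUE(probe())) {
        # Assumed behaviour: fall back to the local cache when offline
        warning("Hub not reachable; falling back to localHub=TRUE")
        return(TRUE)
    }
    FALSE
}

# A stub probe that counts how often the "network" would have been contacted
calls <- 0L
counting_probe <- function() { calls <<- calls + 1L; FALSE }

offline <- resolve_local_hub(TRUE, counting_probe)
```

With localHub=TRUE the probe is never called, so no timeout can occur regardless of how slow the connection test is.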

votti commented 4 months ago

If you agree that this is an issue and the fix 1) would be reasonable, I would be happy to create a PR here and also @AnnotationHub where the issue exists as well.

lshep commented 4 months ago

Agreed. It should not test the connection if localHub=TRUE. PRs are great; otherwise I can look at making this change later this week.