UtrechtUniversity / yoda

A system for reliable, long-term storing and archiving large amounts of research data during all stages of a study.
https://utrechtuniversity.github.io/yoda/
GNU General Public License v3.0

[FEATURE] Work in SURF Research Cloud (SRC) directly on your institutional Yoda data #353

Open Jos-London opened 9 months ago

Jos-London commented 9 months ago

Is your feature request related to a problem? Please describe.

Work in SURF Research Cloud (SRC) directly on your institutional Yoda data, in line with what is already possible with Research Drive.

Yoda is recommended to researchers as the storage environment during all stages of a research project, including the analysis stage.

Collaboration between the Yoda development group and the SRC group seems necessary to support this feature.

Describe the solution you'd like

If you use SURF Research Cloud, you want to work directly on your datasets within Yoda, i.e. read data in and write the results back.

Describe alternatives you've considered

Ultimately, it is not desirable that data has to be dragged to different environments.

Additional context

This feature has already been promised by the SRC group in the past, but seems to be stagnating.

This feature is highly desirable within Erasmus, as an alternative to local server storage and analysis capacity. Connecting the Yoda storage environment to the SRC analysis environment could be a great complete solution for various disciplines.

Danny-dK commented 9 months ago

@tsmeele You worked on SURF Research Cloud, I believe. Do you have any insight? At least https://www.surf.nl/en/services/surf-research-cloud states "Connections to iRODS instances at SURF will also be available soon.", and there is also https://servicedesk.surf.nl/wiki/display/WIKI/Connect+iRODS.

tsmeele commented 9 months ago

It is possible to make Yoda (or any other iRODS system) available as a filesystem mount in SURF Research Cloud. This would be based on the WebDAV protocol, just like the connectivity to Research Drive. In the past I entered a Yoda network disk URL in the place where the Research Cloud portal expects a Research Drive URL, and got that working. This might still work.
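To illustrate what WebDAV-based access involves at the protocol level, here is a minimal Python sketch that lists a WebDAV collection with a `PROPFIND` request. It is not taken from the thread: the function names are hypothetical, any URL, username, or password passed in is a placeholder, and it assumes the server accepts HTTP Basic authentication.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

DAV_NS = "{DAV:}"  # XML namespace used in WebDAV multistatus responses


def build_propfind_request(url, username, password):
    """Build a Depth-1 PROPFIND request that lists one WebDAV collection.

    Assumption: the server accepts HTTP Basic authentication.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        method="PROPFIND",
        headers={"Depth": "1", "Authorization": f"Basic {token}"},
    )


def parse_hrefs(multistatus_xml):
    """Return the href of every entry in a WebDAV multistatus response body."""
    root = ET.fromstring(multistatus_xml)
    return [el.text for el in root.iter(f"{DAV_NS}href") if el.text]


def list_collection(url, username, password):
    """Send the PROPFIND request and return the hrefs the server reports."""
    request = build_propfind_request(url, username, password)
    with urllib.request.urlopen(request) as response:
        return parse_hrefs(response.read().decode("utf-8"))
```

Calling `list_collection(...)` with the network disk URL and current credentials of a real instance would return the paths visible to that user. A filesystem mount does the same kind of exchange behind the scenes for every directory listing.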

Unfortunately, though understandably from a security perspective, accessing a Yoda disk nowadays requires a frequently changing "data access password", which clouds the Research Cloud user journey. Another downside is that, in general, accessing data over a LAN or WAN can be several times slower than using attached storage when random access methods are used, which slows down data analysis. Hence we recommend that users synchronize data from Yoda to a workspace and synchronize the resulting data back to Yoda afterwards. For this purpose we have built a Research Cloud component that installs a tiny graphical application, a wrapper around the iRODS isync command. See https://utrechtuniversity.github.io/researchcloud-items/playbooks/irods-desktop.html for details. One might also implement more advanced data transfer solutions, for instance iBridges; see https://github.com/UtrechtUniversity/iBridges
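The synchronize-in, analyse, synchronize-back workflow can be sketched in Python as below. This is not the graphical wrapper mentioned above. It assumes the iRODS iCommands are installed and a session has already been set up with `iinit`, and it assumes the sync utility meant here is the iCommands `irsync` command, which marks iRODS paths with an `i:` prefix. The function names and any paths passed in are hypothetical.

```python
import subprocess


def irsync_command(source, target, recursive=True):
    """Build an irsync command line (assumes iCommands are on the PATH)."""
    command = ["irsync"]
    if recursive:
        command.append("-r")  # descend into sub-collections / subdirectories
    command += [source, target]
    return command


def pull_command(collection, workdir):
    """Command that copies an iRODS collection into a local workspace directory."""
    return irsync_command(f"i:{collection}", workdir)


def push_command(workdir, collection):
    """Command that copies local results back into an iRODS collection."""
    return irsync_command(workdir, f"i:{collection}")


def run(command):
    """Execute a prepared command and raise CalledProcessError if it fails."""
    subprocess.run(command, check=True)
```

A session would then call `run(pull_command(collection, workdir))` before the analysis and `run(push_command(workdir, collection))` afterwards, so the analysis itself only touches attached storage.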

tsmeele commented 9 months ago

@Danny-dK Indeed, SURF earlier prototyped a solution that connects a user of a Research Cloud workspace to a SURF-hosted iRODS (Yoda) instance. The prototype depended on a manually maintained database at SURF that relates an SRAM username to an iRODS connection path. Once this concept is fully automated, it would provide user-friendly connectivity. Architecturally, it is still an open question how to support users who wish to connect to a non-default iRODS system, e.g. when collaborating with other institutes.
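The lookup idea, and the open question about non-default systems, can be pictured with a small Python sketch. This is purely hypothetical and does not describe SURF's prototype: the table contents, the username, the path, and the `override` parameter are all invented for illustration.

```python
from typing import Optional

# Hypothetical stand-in for the manually maintained database:
# SRAM username -> iRODS connection path. All values are invented.
DEFAULT_CONNECTIONS = {
    "jdoe": "/exampleZone/home/research-demo",
}


def resolve_connection(sram_username: str, override: Optional[str] = None) -> Optional[str]:
    """Return the iRODS connection path for a user.

    An explicit override (e.g. for a collaboration hosted at another
    institute) wins over the default mapping. Returns None when the
    user is unknown and no override is given.
    """
    if override:
        return override
    return DEFAULT_CONNECTIONS.get(sram_username)
```

With a single table, each user can only reach one default system. Supporting other institutes needs some per-user or per-workspace override like the one sketched here, and that is the part that is hard to automate.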

Danny-dK commented 9 months ago

@ccacciari Do you have any insights on making Yoda directly available in SURF Research Cloud (the concept that Ton mentioned in his reply), any progress to mention? Is there a GitHub specifically for SURF Research Cloud for feature requests? I don't think the Yoda team can do anything in making Yoda accessible in the SURF application (would be on SURF side), so would be good if you have a platform for open issues / feature requests regarding SRC (would be better than the closed ticketing system at SURF).

ccacciari commented 8 months ago

@ccacciari Do you have any insights on making Yoda directly available in SURF Research Cloud (the concept that Ton mentioned in his reply), any progress to mention?

The iRODS back-end of Yoda is already accessible via the command line or a WebDAV-mounted folder from any computing environment at SURF, including SURF Research Cloud. However, this requires manual configuration and some expertise. In more detail: the Yoda team at SURF, the SURF Research Cloud team, and the SRAM team are working on it, but it is a long-term project, so I do not have dates yet. The idea is to mount an iRODS folder as a local folder on the SRC VM via WebDAV. The aim of the project is to automate this, as Ton mentioned, so that the user has the folder already mounted when the SRC catalog app starts. The tricky part, as usual, is authentication and authorization.

Is there a GitHub specifically for SURF Research Cloud for feature requests? I don't think the Yoda team can do anything in making Yoda accessible in the SURF application (would be on SURF side), so would be good if you have a platform for open issues / feature requests regarding SRC (would be better than the closed ticketing system at SURF).

No, SURF Research Cloud does not have one. We can, of course, pass any feedback on to our SRC colleagues, although clearly this approach does not provide the level of interaction and transparency that you would like.

Danny-dK commented 8 months ago

@Jos-London does the answer of @ccacciari suffice for now?

Danny-dK commented 8 months ago

Not part of Yoda but part of SURF Research Cloud. Opt to close, @lwesterhof or @RobvanSchip.