ExPaNDS-eu / ExPaNDS

The main repository for the ExPaNDS project sponsored by the EU

Reference Testbed for development of Data Pipelines #22

Closed: servansod closed 1 year ago

servansod commented 3 years ago

Origin: Review Report PMOC-857641-1-RP1

Recommendation 7 (Reference Testbed for Development of Data Pipelines over PaN Facilities): Based on the work of WP4 the consortium should attempt to produce reusable assets that facilitate data scientists and domain experts to develop data analytics pipelines of FAIR data of the facility. In this direction the development of a reference testbed for data science that could serve as a blueprint for the community could be considered. Likewise, cookbooks, how-tos and relevant training materials could be developed as part of WP5.

marauskajul commented 2 years ago

Progress report (May 2022): Silvan is working on a reference test bed. Some use cases are already on the platform. There should not only be pipelines but also software infrastructure, training material and datasets. This topic will be discussed during the f2f meeting in Prague (14/05-15/05). Majid will speak to colleagues about use cases. Topic must be discussed with WP2 and WP3.

marauskajul commented 2 years ago

E-mail from Silvan (28/09/2022): This item is still being worked on. @Majid: additional use cases would still be great, as up to now CrystFEL remains the only use case at hand.

marauskajul commented 1 year ago

E-mail (Silvan, 07/02/2023):

Hi Juliane,

coming back to issue #22, I guess it can be closed by now. I'm not sure how to comment on it, as I don't have a DESY-connected GitHub account, so I'm sending it to you :)

A VISA platform was established at DESY, which allows scientists to develop data analytics pipelines for open data or for experimental data they have access to. Additionally, these pipelines can be run via containers to achieve reproducible results. The VISA installation serves as prototype and blueprint for other facilities, as it is currently being copied/deployed at SOLEIL and possibly ALBA. For easier container handling, a script is provided which automates the creation of wrapped container commands.

Let me know if anything else is needed in relation to this issue.

Best, Silvan

E-mail (Majid, 07/02/2023):

Hi Silvan, Juliane,

Regarding the sentence "The VISA installation serves as prototype and blueprint for other facilities, as it is currently being copied/deployed at SOLEIL and possibly ALBA.":

It should read "currently being deployed at SOLEIL and ALBA".

Regards, Majid

E-mail (Anton, 07/02/2023):

Hi,

Just adding some comment as to how I parse issue #22:

Recommendation 7 (Reference Testbed for Development of Data Pipelines over PaN Facilities): Based on the work of WP4 the consortium should attempt to produce reusable assets that facilitate data scientists and domain experts to develop data analytics pipelines of FAIR data of the facility. In this direction the development of a reference testbed for data science that could serve as a blueprint for the community could be considered. Likewise, cookbooks, how-tos and relevant training materials could be developed as part of WP5.

Let's map that into a more easily understood form:

"produce reusable assets"
VISA + container registry

"that facilitate data scientists and domain experts to develop data analytics pipelines of FAIR data of the facility"
That enable people to put software into portable containers of notebooks, i.e. copy what you did with CrystFEL + Singularity

"In this direction the development of a reference testbed for data science that could serve as a blueprint for the community could be considered."
VISA installation at DESY, which is being copied at SOLEIL and perhaps ALBA, is a prototype of this…

"Likewise, cookbooks, how-tos and relevant training materials could be developed as part of WP5."
Instructions for 'How to containerise your application to run in VISA like CrystFEL does'…. Pass on some documentation to WP5 describing how you did it / how it's there in Singularity but the wrappers make it work easily, so this is how people can fit their applications into what you've done with CrystFEL?
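
For readers unfamiliar with the idea of "wrapped container commands" mentioned in the e-mails above, here is a minimal sketch of what such a generator might look like. It is not the script provided at DESY, which is not shown in this thread. The function name `make_wrapper` and the image path are hypothetical. The only external assumption is the standard `singularity exec <image> <command>` invocation, with `indexamajig` (a CrystFEL program) as the example command.

```python
import shlex


def make_wrapper(image_path: str, command: str) -> str:
    """Return the text of a shell wrapper that runs `command` inside a container.

    Illustrative only: `image_path` is a hypothetical location of a
    Singularity image, not a path taken from the DESY installation.
    """
    return (
        "#!/bin/sh\n"
        "# Generated wrapper: forwards all arguments to the containerised command.\n"
        f"exec singularity exec {shlex.quote(image_path)} "
        f'{shlex.quote(command)} "$@"\n'
    )


# Hypothetical image path; indexamajig is used only as an example command.
wrapper_text = make_wrapper("/opt/images/crystfel.sif", "indexamajig")
print(wrapper_text)
```

Saved as an executable file named after the command and placed on the `PATH`, a wrapper of this kind lets users call the tool as if it were installed natively, while it actually runs inside the container.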

servansod commented 1 year ago

Final justification for this recommendation (taken from the periodic report):

Task 4.5, which was planned for the second period of ExPaNDS, is aligned with this recommendation, namely the testing and validation framework published by MAX IV (see section 1.2.4, task 4.5). In addition, training workflows were developed in PaN-training.eu to explain how to reproduce the reference data analysis services developed in WP4 (see section 1.2.4, task 4.7).