ReScience C submissions

Ten years reproducibility challenge: paper #9 #22

Closed rdicosmo closed 4 years ago

rdicosmo commented 4 years ago

Original article: M. Danelutto, R. Di Cosmo, X. Leroy, and S. Pelagatti. “Parallel Functional Programming with Skeletons: the OCamlP3L experiment.” In: ACM Workshop on ML and its applications. ACM. Baltimore, United States, Sept. 1998. https://hal.archives-ouvertes.fr/hal-01499962v1

PDF URL: https://gitlab.inria.fr/dicosmo/ocamlp3l-rescience/-/blob/master/article.pdf
Metadata URL: https://gitlab.inria.fr/dicosmo/ocamlp3l-rescience/-/blob/master/metadata.yaml
Code URL: https://archive.softwareheritage.org/swh:1:rev:2db189928c94d62a3b4757b3eec68f0a4d4113f0;origin=https://gitorious.org/ocamlp3l/ocamlp3l_cvs.git/

Scientific domain: Parallel programming, HPC
Programming language: OCaml
Suggested editor:

rdicosmo commented 4 years ago

This is the submission corresponding to https://github.com/ReScience/ten-years/issues/1#issuecomment-542379915

khinsen commented 4 years ago

Thanks for this submission, I will start looking for a reviewer.

rougier commented 4 years ago

Gentle reminder

khinsen commented 4 years ago

@rougier I am contacting potential reviewers by mail, that's why there is no activity here.

rougier commented 4 years ago

ok

khinsen commented 4 years ago

Frédéric Gava has kindly agreed to review this paper.

khinsen commented 4 years ago

Here's Frédéric's report, sent by mail because he doesn't have a GitHub account yet.

Afin de montrer l'utilité d'une plateforme de sauvegarde de logiciels et donc ne de pas juste se contenter d'introduire dans ces articles des URLs sur les pages web des auteurs, URLs qui sont non fiables a plus ou moins long termes suivant l'évolution des carrières des chercheurs ou des institutions elles-mêmes, l'auteur propose de refaire des tests de performances (accélération d'une petite application) d'une bibliothèque de patrons algorithmiques, appelée OCamlP3l. Celle-ci a été codée par l'auteur lui-même ainsi que par d'autres chercheurs, avec et pour le langage OCaml, et a été décrite dans un article publié il y a 23 ans (!).

L'auteur note que retrouver les sources de la version de originel de l'article fut impossible. Et on peut supposer que ce serait aussi le cas pour le matériel utilisé à l'époque. Mais, grâce à une retro-compatibilité très forte du langage OCaml (et de ces bibliothèques, notamment pour les communications TCP/IP) et, on peut supposer, avec à la grande stabilité des outils de compilation (Makefile, etc.), il a été possible de refaire fonctionner un code OCamlP3l (avec une version un peu ultérieure à la version originel). Ceci montre la grande capacité de reproductibilité de cet environnement de programmation et donc de ses capacités de réplicabilité en supposant un accès au même matériel (ou via des VMs ?)

Il serait intéressant de comparer ce travail avec d'autre anciens projets OCaml car ici, il semble que c'est bien l'incroyable rétrocompatibilité de OCaml qui est démontrée (le rapporteur admet l'avoir testé sur ses anciens programmes d'il y a 20 ans et qui eux aussi refonctionnent sans problème). Il serait aussi intéressant de comparer ce travail avec d'autres environnements (avec le langage C par exemple) : comment recompiler des anciens projets qui utiliseraient plus d'appels systèmes (obsolètes) ou utilisant des bibliothèques qui ne seraient plus compatibles voir compilables. Et comment répliquer des tests de performances quand le matériel change: par exemple, avec des processeurs bien plus efficaces (plus de mémoires caches qu'il y a 20 ans), les performances d'un programme pour architectures distribuées pourraient être fortement impactées par des réseaux toujours aussi lents.

Pour finir, il faut noter qu'une plateforme de sauvegarde de logiciels qui associerait les papiers de recherche à des sources (et des versions de logiciels) seraient très bénéfique pour la reproductibilité des expériences. Il reste néanmoins le problème du matériel: est-il raisonnable de vouloir reproduire, 20 ans après, les performances d'un programme tournant sur une grappe de PCs de milliers de processeurs ? Est-ce faisable ? 

khinsen commented 4 years ago

Two comments on this report:

khinsen commented 4 years ago

English translation of the review by Frédéric Gava

To demonstrate the usefulness of a software archiving platform, rather than merely citing in articles the URLs of the authors' web pages, which become unreliable sooner or later as researchers' careers or the institutions themselves evolve, the author proposes to redo the performance tests (speedup of a small application) of a library of algorithmic skeletons called OCamlP3l. The library was written by the author himself together with other researchers, in and for the OCaml language, and was described in an article published 23 years ago (!).

The author notes that recovering the sources of the original version used in the article proved impossible. One can assume that the same would be true of the hardware used at the time. But thanks to the very strong backward compatibility of the OCaml language (and of its libraries, notably for TCP/IP communication) and, one can assume, to the great stability of the build tools (Makefile, etc.), it was possible to get an OCamlP3l code running again (with a version slightly later than the original one). This shows the high reproducibility of this programming environment, and hence its replicability, assuming access to the same hardware (or via VMs?).

It would be interesting to compare this work with other old OCaml projects, because what seems to be demonstrated here is really the incredible backward compatibility of OCaml (the reviewer admits to having tested it on his own programs from 20 years ago, which also run again without any problem). It would also be interesting to compare this work with other environments (the C language, for example): how does one recompile old projects that make heavier use of (obsolete) system calls, or that rely on libraries that are no longer compatible or even compilable? And how does one replicate performance tests when the hardware changes? For example, with much more efficient processors (more cache memory than 20 years ago), the performance of a program for distributed architectures could be strongly affected by networks that remain as slow as ever.

Finally, it should be noted that a software archiving platform linking research papers to sources (and software versions) would be very beneficial for the reproducibility of experiments. The hardware problem nevertheless remains: is it reasonable to want to reproduce, 20 years later, the performance of a program running on a PC cluster with thousands of processors? Is it feasible?

Translated with www.DeepL.com/Translator (free version)

rdicosmo commented 4 years ago

Thanks @khinsen for getting this review. Delighted to see that the reviewer liked the results presented here, and that he had a similar experience with OCaml programs of his own. I do not see any changes to make to the submission for the moment.

khinsen commented 4 years ago

@rdicosmo I agree that this review does not suggest any changes to the manuscript. So... I declare this submission accepted!

khinsen commented 4 years ago

@rdicosmo You can find an updated metadata.yaml with all the required information for the final published PDF at https://gist.github.com/khinsen/c559f9ec32af7f268deb219813cd7d5e. Could you please use it to generate the PDF yourself? It doesn't work for me because it uses a LaTeX package that I cannot find (software-biblatex).

rdicosmo commented 4 years ago

Thanks a lot @khinsen !!! The recompiled paper is now available at https://gitlab.inria.fr/dicosmo/ocamlp3l-rescience/-/blob/master/article.pdf

khinsen commented 4 years ago

Thanks @rdicosmo !

The article is published (http://doi.org/10.5281/zenodo.3763416) and will soon appear on the ReScience Web site.

rdicosmo commented 4 years ago

That's great, thanks a lot for this great experience :-)