ltalirz / atomistic-software

Tracking citations of atomistic simulation engines
https://atomistic.software
GNU Affero General Public License v3.0

data - consider adding "public release" column for code #8

Closed (ltalirz closed this issue 2 years ago)

ltalirz commented 3 years ago

to give an idea how long they've been around

ltalirz commented 3 years ago

The question of when a code was founded is often not trivial to answer.

One might be tempted to look at the date at which development started. Yet many codes are not developed from scratch, which makes it difficult to pinpoint when exactly development of a previous project stopped and development of a new code started.

Alternatively, one may want to focus on the first public release of a code (which is more relevant from a user perspective and thus for atomistic.software). However, many codes existed in various stages of availability over the years.

Just as an illustrative example, here is an excerpt of the history of VASP:

  • VASP is based on a program initially written by Mike Payne at MIT. Hence, VASP has the same roots as the CASTEP/CETEP code, but branched from this root at a very early stage. At the time the VASP development was started, the name CASTEP was not yet established. The CASTEP version upon which VASP is based only supported local pseudopotentials and a Car-Parrinello type steepest descent algorithm.
  • July 1989: Jürgen Hafner brought the code to Vienna after half a year stay in Cambridge.
  • Sep. 1991: work on the VASP code was started. At this time, in fact, the CASTEP code was already further developed, but VASP development was based on the old 1989 CASTEP version.
  • Oct. 1992: ultra-soft pseudopotentials were included in the code, the self-consistency loop was introduced to treat metals efficiently.
  • Jan 1993: J. Furthmüller joined the group. He wrote the first version of the Pulay/Broyden charge density mixer and contributed - among other things - the symmetry code, the INCAR-reader and a fast 3D-FFT.
  • Feb 1995: J. Furthmüller left Vienna. By then, VASP had got its final name and had become a stable and versatile tool for ab initio calculations.

VASP is referred to by name in https://doi.org/10.1103/PhysRevB.50.13181 (1994). This paper cites https://doi.org/10.1103/PhysRevB.47.558 (1993) as a reference for VASP (but in that paper the package is not referred to by name). A more comprehensive review of the code is presented in https://doi.org/10.1007/978-1-4615-5943-6_10 (1997). None of these papers mention how and under which conditions an interested user would be able to obtain a copy of VASP. However, groups outside Vienna were already publishing papers using VASP in the 1990s (e.g. https://doi.org/10.1016/S0009-2614(98)00569-7), while the www.vasp.at web site began operation only in 2012.

In conclusion, I propose that the date (year) to be recorded should be the date on which a code was made available to the (worldwide) public under clearly stated terms and conditions. For open-source codes this would be the date on which the code was made publicly available with a license.
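As a rough illustration of what such a column could look like in the data, here is a minimal Python sketch. The `CodeEntry` class, its field names, the `years_available` helper, and the example entry are all hypothetical and do not reflect the actual atomistic.software schema; the year is optional because, as discussed above, it is often not reliably known.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodeEntry:
    """Hypothetical record for one simulation engine (not the real schema)."""
    name: str
    license: str
    # Year the code became publicly available under clearly stated terms.
    # None means the year could not be established reliably.
    public_release_year: Optional[int] = None


def years_available(entry: CodeEntry, reference_year: int) -> Optional[int]:
    """Return how many years the code has been public, or None if unknown."""
    if entry.public_release_year is None:
        return None
    return reference_year - entry.public_release_year


# Made-up example entry, for illustration only.
example = CodeEntry(name="SomeCode", license="GPL-3.0", public_release_year=2005)
unknown = CodeEntry(name="OtherCode", license="proprietary")
```

Keeping the field optional avoids forcing a guess for codes whose availability history is unclear.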

This type of data is not always straightforward to discern from information available on the web, and obtaining it may require reaching out to code authors individually.

ltalirz commented 2 years ago

Given that this list focuses on trends in simulation software usage, I've come to the conclusion that it makes sense to abstain from adding metadata, such as a public release date, that is not well defined.

If we feel at any point that we need more of a historical perspective, we can always extend the dataset to times before 2010 and judge how long a code has been "around" based on how it has been cited. Note, however, that this may require the addition of further codes to the list that were relevant during the time period in question, and that the query strings for codes may need to be adapted to cover citation practices of that time.
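A minimal sketch of the citation-based idea, assuming yearly citation counts are available as a plain mapping from year to count. The function name, the `min_citations` threshold, and the numbers are made up for illustration; this is not how the site currently processes its data.

```python
from typing import Dict, Optional


def first_year_cited(citations_by_year: Dict[int, int],
                     min_citations: int = 1) -> Optional[int]:
    """Return the earliest year whose citation count reaches min_citations.

    Returns None if no year reaches the threshold.
    """
    qualifying = [year for year, count in citations_by_year.items()
                  if count >= min_citations]
    return min(qualifying) if qualifying else None


# Made-up counts for a hypothetical code.
counts = {2008: 0, 2009: 3, 2010: 12, 2011: 40}
```

A higher `min_citations` value gives a more conservative estimate of when a code was in real use, at the cost of reporting a later year.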