ourresearch / depsy

Track the impact of research software.
http://depsy.org
MIT License

the definition of research software in depsy is poor #98

Open mmckerns opened 7 years ago

mmckerns commented 7 years ago

All of the packages for which I'm a primary author or a major contributing author are classified by depsy as not research software: https://pypi.python.org/pypi/dill https://pypi.python.org/pypi/klepto https://pypi.python.org/pypi/pox https://pypi.python.org/pypi/ppft

These packages are classified as non-research software by depsy, and don't have any stats: https://pypi.python.org/pypi/mystic https://pypi.python.org/pypi/pathos https://pypi.python.org/pypi/pygrace https://pypi.python.org/pypi/pyIDL https://pypi.python.org/pypi/pyina

And these aren't even indexed by depsy: https://pypi.python.org/pypi/multiprocess

In addition, there are no contributors listed for any of the packages above, except for the first group of four packages (dill, klepto, pox, and ppft) -- and the authorship of ppft is incorrect.

I'm also listed as two people: http://depsy.org/person/354677 and http://depsy.org/person/417820. The first is credited with some minor contributions, and the second with dill, klepto, and pox.

The metadata for the packages are standard, and all look similar to this:

Development Status :: 5 - Production/Stable
Intended Audience :: Developers
Intended Audience :: Science/Research
License :: OSI Approved :: BSD License
Topic :: Scientific/Engineering
Topic :: Software Development
...
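
To illustrate why these classifiers look like an unambiguous signal, here is a minimal sketch of a classifier-based check. It is not depsy's actual logic; the function name `looks_like_research_software` and the choice of marker strings are assumptions made for this example only.

```python
# Hypothetical check, NOT depsy's implementation: treat a package as research
# software if any trove classifier equals, or is nested under, one of these markers.
RESEARCH_MARKERS = (
    "Intended Audience :: Science/Research",
    "Topic :: Scientific/Engineering",
)


def looks_like_research_software(classifiers):
    """Return True if any classifier matches a research marker or a sub-topic of one."""
    for raw in classifiers:
        classifier = raw.strip()
        for marker in RESEARCH_MARKERS:
            if classifier == marker or classifier.startswith(marker + " :: "):
                return True
    return False


# The classifiers quoted above (the elided "..." entries are left out).
classifiers = [
    "Development Status :: 5 - Production/Stable",
    "Intended Audience :: Developers",
    "Intended Audience :: Science/Research",
    "License :: OSI Approved :: BSD License",
    "Topic :: Scientific/Engineering",
    "Topic :: Software Development",
]

print(looks_like_research_software(classifiers))
```

Under these assumptions, the metadata above would be flagged as research software on the strength of either marker alone.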

Most of the packages have been released for roughly a decade, with fairly regular releases. These packages not only support my research, but have been supporting the research of others for several years -- many of the above packages have download counts in the thousands. I can see from other download counters that the download stats on depsy are wrong (especially in the cases where the downloads are listed as zero). Several of these packages have publications written about them, and are cited in research publications (both my own and, more importantly, those of others). I don't see anything about these packages that makes them non-standard, or that would cause them to be labeled as non-research.

I'm not sure exactly why depsy gets it so wrong, even after looking at the code. I understand depsy is a new research project, is an early iteration, and has bugs in the logic. I wouldn't care about this project, aside from the fact that some people are using it as an impact factor for researchers who code... and if it's so wrong about the packages I support, I can't imagine that depsy provides anything but a misleading story, and can potentially do more damage than the good it hopes to do.