mozillascience / software-citation-tools

https://mozillascience.github.io/software-citation-tools/
MIT License

What is the post-creation life cycle of software produced for your research? #6

Open faokryn opened 8 years ago

faokryn commented 8 years ago

Please describe the typical life cycle of software produced for your research, after creation and use. Is availability to and reusability by other researchers considered? How was the software made available to others, i.e. how was it hosted? If such software has been used by other researchers, what are some of the ways they've cited it, if at all?

More Discussion Questions

aurelg commented 8 years ago

From my own experience in bioinformatics, here are the most common situations:

Imagine you are a PhD student and write a successful program, mostly alone: you publish the associated scientific results. After publication, you may either keep your program for yourself (or for whoever holds the IP), or be free to distribute it.

Anyway, if you make it available, you’ll soon have to answer requests for feature A or B, and reports of bug X or Y. The former will slow down your own work, and the latter might reveal flaws in your research and hurt your career. The more successful you are, the more exposed you are, and the more vulnerable you are. This is kind of a curse, especially if you’re not promoted quickly.

One day, you are eventually paid to work on another project. If your code is not open source with an active community able to take over, then you'll have to choose: either you keep maintaining it, or you do whatever you are paid for, or you do both (and burn out). This is just not sustainable, and I’ve seen so many renowned software packages that are just abandoned and/or riddled with bugs no one knows how to fix.

IMHO, this situation comes from how research is organized. Group leaders are promoted because of their scientific abilities. This focus on scientific results lets most PhD students and even postdocs think that quick & dirty methods are good enough. Consequently, there is no incentive to build a culture of sustainable software, which explains the lack of support for scientific software development and why so much scientific software is broken.

Fortunately, as far as personal/lab branding and citations are concerned, the best strategy seems to be to publish the scientific results first; then a methodological paper comes for free, and finally another article advertises the website/web service once it's available. If the website is widely used, you can then write an article every other year. This is by far the most rewarding strategy in the long run in terms of publications, and can even allow for sustainable funding of the software.

If this strategy can’t be implemented (or in combination with it anyway), then the safest option is certainly to release your software under a permissive (i.e. free) license, and hope someone will take over when you leave for other horizons.

Well, that was the quick (!) answer. More to come if you're interested :-)

Ourobor commented 8 years ago

@aurelg I had no idea academia was so cutthroat! I am still just an undergrad and I guess my picture of graduate work is a bit flawed. I do understand the fear of releasing flawed code though as it plagued me for quite a while in my first few years of undergrad.

One of the things I was thinking about when reading the paper we used as inspiration for the project is the idea that citing software created during research allows it to be included in the peer review process. Would a focus on getting research code peer-reviewed and "legitimized" be a good way to encourage code to be published?

Alternatively, the goal of Software Citation Tools is to cite software. We're really looking to collect metadata for the software being used and make it available. Perhaps adding an option to cite the software, with its metadata including authors, purpose, etc., without linking to the source code would be useful?
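To make that concrete, here is a minimal sketch in Python of what such a record might look like. Everything in it is an assumption for illustration: the `SoftwareCitation` class, its field names, and the output format are hypothetical and are not the project's actual schema. The point is only that the source link is optional, so software that was never released can still be cited from its metadata.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SoftwareCitation:
    """Hypothetical metadata record; field names are illustrative only."""
    title: str
    authors: List[str]
    purpose: str
    version: Optional[str] = None
    year: Optional[int] = None
    source_url: Optional[str] = None  # stays None when the code is not public

    def format(self) -> str:
        """Render a plain-text citation, skipping any field that is missing."""
        parts = [", ".join(self.authors)]
        if self.year is not None:
            parts.append(f"({self.year})")
        title = self.title
        if self.version is not None:
            title = f"{title} (version {self.version})"
        parts.append(f"{title} [Computer software].")
        if self.source_url is not None:
            parts.append(self.source_url)
        return " ".join(parts)


# Made-up example: software cited without any link to its source code.
closed = SoftwareCitation(
    title="ExampleAligner",
    authors=["A. Researcher", "B. Student"],
    purpose="Sequence alignment for a hypothetical study",
    version="1.2",
    year=2017,
)
print(closed.format())
```

If the code is published later, setting `source_url` would simply append the link to the same citation.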

aurelg commented 7 years ago

I think it's a good idea to promote software citations. Most citations, however, don't refer to the source code itself (which may not be available), but to a reference paper that describes the original version of the algorithm. There are usually some variations:

I guess a nice way to promote software citation would be to create a scientific software repository with ready-to-use Docker images. A bit like this initiative. Then, citing software would be easy and safe. However, I don't think it can be achieved, as research software is usually bound to solving scientific issues, which drive its evolution much more than the need to publish it and/or achieve reproducibility.

@Ourobor I'm not talking about academia as a whole, only my experience. YMMV (and I hope so).