hodcroftlab / covariants

Real-time updates and information about key SARS-CoV-2 variants, plus the scripts that generate this information.
https://covariants.org/
GNU Affero General Public License v3.0

Consistent python style, linting and formatting for `scripts` folder #245

Open maltekuehl opened 2 years ago

maltekuehl commented 2 years ago

Currently, the Python scripts in the ./scripts/ folder do not seem to strictly follow any linting, formatting, typing or style guide. Docstrings are often missing, and there seem to be no configuration files to ensure consistent code style across IDEs.

To enable consistent collaboration, it might be of interest to set up:

emmahodcroft commented 2 years ago

Hi @maltekuehl, thank you for this suggestion! I'm afraid I'm not very well versed in linting. I generally code with Visual Studio Code, so if there's something specific you'd recommend for doing this through that, please let me know. Help would be welcome to address this, as for me it is likely low on the priority list with the very limited resources we have! :)

maltekuehl commented 2 years ago

Thanks for the comment, @emmahodcroft! With this issue I just wanted to check if there's interest in this or if anyone's already working on it. I originally took a look at the scripts folder in the context of #243, as I wanted to see what it would take to integrate the JHU/OWID prevalence data into the JSON data. Since I didn't want to randomly mess around with the code without a proper structure or introduce unwanted linting/formatting, I thought creating this issue might be a good first step :)

Since you say that help would be welcome, I might take a look at this and implement some basic linting and formatting over the Christmas holidays if I manage to get some time off. Hopefully this will facilitate future additions to the Python code.

ivan-aksamentov commented 2 years ago

Hi @maltekuehl,

Thanks for the suggestion. As you probably know, due to their huge workload, academic researchers are typically not willing to spend time on learning and using best practices, even for tools they use routinely, because there are more pressing activities they need to pursue. As an engineer, I'd be happy if the code was formatted and linted, but the most important thing is not to put obstacles in the way of Emma producing results quickly, because they are so badly needed during the pandemic. In other words, linting is okay, as long as things get done. Docs are probably a complete waste of time, because this is not a library, the code changes very fast, and nobody ever reads it except Emma. Tests will be almost impossible, because Emma will not write them for the new code and the old ones will become obsolete the moment they are written.

There have been attempts previously, like this: https://github.com/hodcroftlab/covariants/pull/131

But the important things to note are that

Happy to hear your thoughts and to consider contributions!