MehmedGIT / OtoN_Converter

Converter from basic OCRD process workflow to Nextflow workflow script
Apache License 2.0
4 stars 1 forks source link

Michelle's refactoring suggestions #1

Closed mweidling closed 2 years ago

mweidling commented 2 years ago

In this PR the following changes have been made:

Feel free to cherry-pick whatever seems right for you.

MehmedGIT commented 2 years ago

Hi Michelle! Thanks for your contribution. I went over the changes you suggested and found out a few things that could be further improved:

  1. In the config.toml, we cannot simply replace $HOME and $projectDir with .. The project directory from the viewpoint of Nextflow may not be always inside the home or current directory. Moreover, each Nextflow process gets its own separate work environment, independent of other processes. Thus, we need to provide proper environment variables and let Nextflow handle them afterward -> $projectDir. The same argument is valid for $HOME. Even if you replace the home environment variable with ~, Nextflow will complain that it cannot evaluate ~ to a path. I guess this is due to the implementation details of Nextflow.

I like the idea of having a separate configuration file to set the proper paths. However, we must clearly specify why $HOME and $projectDir have to stay. Moreover, it may be confusing for the user why we need an escape character before $HOME but not before $projectDir. The answer is provided here: https://www.nextflow.io/docs/latest/process.html#script.

Since Nextflow uses the same Bash syntax for variable substitutions in strings, you must manage them carefully depending on whether you want to evaluate a Nextflow variable or a Bash variable.

  1. The token error numbers should ideally match with the number of the specific rule that was broken - rule numbers given in the comments above the validator methods. We may also extend the rules or provide them in a separate file to keep things clean.

  2. I still do not understand how the imports are sorted. Seems they do not follow alphabetical order. Any ideas according to what the Linter sorted them?

Everything else seems fine. Please implement the suggested changes and I will merge your PR.

mweidling commented 2 years ago

Hi Michelle! Thanks for your contribution. I went over the changes you suggested and found out a few things that could be further improved:

1. In the `config.toml`, we cannot simply replace `$HOME` and `$projectDir` with `.`. The project directory from the viewpoint of Nextflow may not be always inside the home or current directory. Moreover, each Nextflow process gets its own separate work environment, independent of other processes. Thus, we need to provide proper environment variables and let Nextflow handle them afterward -> `$projectDir`. The same argument is valid for `$HOME`. Even if you replace the home environment variable with `~`, Nextflow will complain that it cannot evaluate `~` to a path. I guess this is due to the implementation details of Nextflow.

I like the idea of having a separate configuration file to set the proper paths. However, we must clearly specify why $HOME and $projectDir have to stay. Moreover, it may be confusing for the user why we need an escape character before $HOME but not before $projectDir. The answer is provided here: https://www.nextflow.io/docs/latest/process.html#script.

Since Nextflow uses the same Bash syntax for variable substitutions in strings, you must manage them carefully depending on whether you want to evaluate a Nextflow variable or a Bash variable.

2. The token error numbers should ideally match with the number of the specific rule that was broken - rule numbers given in the comments above the validator methods. We may also extend the rules or provide them in a separate file to keep things clean.

3. I still do not understand how the imports are sorted. Seems they do not follow alphabetical order. Any ideas according to what the Linter sorted them?

Everything else seems fine. Please implement the suggested changes and I will merge your PR.

ad 1) Sure, I can re-introduce the original settings to the TOML.

ad 2) I will adjust the numbering according to the rule numbers. I will also think about a better solution for the rules but this shouldn't impede this PR IMO.

ad 3) I do not understand them as well. :smile: I use Pylint for VS Code and its "Sort import" functionality. According to the docs it should sort them alphabetically, but well.