This issue lists all major and minor issues to address before the next release. Most are related to rpbp or indirectly to pbiotools, but some pbiotools-specific issues are also listed so that everything is in one place.
Change existing functionalities
[X] Removed "unused" scripts, e.g. notebooks and some analysis scripts.
[X] rpbp doesn't really handle overlapping ORFs. The filtered predictions are built in two steps: the longest ORF at each stop codon is selected (but based on start, and not on sum of exon lengths?), then only the ORF with the highest Bayes factor is picked among overlapping ORFs. We noticed in the past that the distinction between "filtered" and "unfiltered" ORFs caused confusion, and it also increases the number of output files. In the future, we should only write "filtered" ORFs by default, unless instructed otherwise (add an option to write "unfiltered" ORFs). For the regression tests, run with that option enabled.
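The two-step filtering described above can be sketched as follows. This is only an illustration, not the rpbp implementation: the Orf record, its field names, the greedy selection, and the single-interval overlap test are all assumptions (real ORFs are spliced and stranded, which this sketch ignores).

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Orf:
    # Hypothetical record; the field names are assumptions for this sketch.
    orf_id: str
    seqname: str
    start: int  # 0-based, inclusive
    end: int    # exclusive; stands in for the stop codon position
    bayes_factor: float


def longest_per_stop(orfs):
    """For each (seqname, end), keep the ORF with the most upstream start."""
    def stop_key(orf):
        return (orf.seqname, orf.end)

    return [
        min(group, key=lambda o: o.start)
        for _, group in groupby(sorted(orfs, key=stop_key), key=stop_key)
    ]


def best_among_overlapping(orfs):
    """Greedily keep ORFs by decreasing Bayes factor, skipping any that overlap a kept ORF."""
    kept = []
    for orf in sorted(orfs, key=lambda o: o.bayes_factor, reverse=True):
        overlaps = any(
            orf.seqname == k.seqname and orf.start < k.end and k.start < orf.end
            for k in kept
        )
        if not overlaps:
            kept.append(orf)
    return sorted(kept, key=lambda o: (o.seqname, o.start))


def filter_orfs(orfs):
    """Apply both steps: longest ORF per stop, then best Bayes factor among overlaps."""
    return best_among_overlapping(longest_per_stop(orfs))
```

Note that the first step compares start coordinates only, which mirrors the open question above about using the sum of exon lengths instead.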
Minor/other issues
[X] pbiotools.misc.suppress_stdout_stderr does not seem to do its job: Stan verbose output is not redirected to /dev/null anymore.
[X] When re-installing locally with pip install -e . (I tested in editable mode), the Stan models were re-compiled. Unless we specify it, they should not be recompiled. There is currently an option in the setup.py, but it hasn't been thoroughly tested.
Re-compilation was prompted by copying the model files. Anyway, this is irrelevant if we move to toml as config format and compile the models separately (pre-compile for conda), see #135
[X] pbiotools: there are some SettingWithCopyWarning warnings when running pbiotools pytest tests/.
[X] seed and chains are defined twice but with the same keywords. Handling of selected options, e.g. smoothing_fraction, in file name and actual value (from default and config).
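On the suppress_stdout_stderr item above: one possible cause (an assumption, not checked against the pbiotools code or the Stan interface in use) is that output written by compiled code or child processes bypasses sys.stdout and sys.stderr, so it has to be silenced at the file-descriptor level. A minimal sketch of that technique, with a hypothetical function name:

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def suppress_fd_output():
    """Point file descriptors 1 and 2 at os.devnull for the duration of the block."""
    # Flush Python-level buffers so earlier output is not lost or reordered.
    sys.stdout.flush()
    sys.stderr.flush()
    saved_out, saved_err = os.dup(1), os.dup(2)
    devnull = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull, 1)
        os.dup2(devnull, 2)
        yield
    finally:
        # Flush anything buffered inside the block while it still goes to devnull.
        sys.stdout.flush()
        sys.stderr.flush()
        os.dup2(saved_out, 1)
        os.dup2(saved_err, 2)
        for fd in (devnull, saved_out, saved_err):
            os.close(fd)
```

Because the redirection happens on the descriptors themselves, it also covers subprocesses started inside the with block, since they inherit descriptors 1 and 2.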
Improvements
[ ] Option handling for external programs (STAR, Flexbar) needs to be improved. For example, with --use-slurm we need to escape options when passing the arguments --star-options \"--genomeSAindexNbases 10\", but without --use-slurm this does not work. It largely depends on the shell, as it may strip the quotes, and the behaviour is expected to be different when running run-rpbp-pipeline, as it also pre-processes the option string (see pbiotools pgrm_utils.py).
We can keep this for a patch release at a later stage, I don't think this is critical.
[X] Is the behaviour behind the message "[utils.create_symlink]: file already exists at" subject to args.overwrite? We should leave the links if they already exist! In particular, check the behaviour of this flag for the profile construction (create-base-genome-profile).
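For the external program options discussed above, one way to make quoting independent of how many shells the string passes through is to keep options as a list of tokens and quote only when a command line is rendered as a string. A minimal sketch using the standard shlex module; the helper names are hypothetical and this is not what pgrm_utils.py currently does.

```python
import shlex


def parse_program_options(option_string):
    """Split a user-supplied option string (e.g. the value of --star-options) into tokens."""
    return shlex.split(option_string)


def render_command(executable, arguments):
    """Join tokens into one shell-safe string, e.g. for writing into a batch script."""
    return " ".join(shlex.quote(token) for token in [executable, *arguments])


star_options = parse_program_options("--genomeSAindexNbases 10")

# Forwarding the options through another command line: the value is quoted
# exactly once, at the point where the string is built.
forwarded = render_command(
    "run-rpbp-pipeline", ["--star-options", " ".join(star_options)]
)
```

With this approach the same token list can be handed directly to subprocess.run without a shell, or rendered once for a Slurm script, so no manual backslash escaping is needed on the calling side.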
CI/release
[X] Best practice and tools. Also add matrix of tests to rpbp. When we're all done, bump to next release -> Conda/PyPI.
Mostly done, preparing for the next release.
[X] Pysam 0.20.0 is now on conda. Update environment.
[x] There are some warnings during rpbp and pbiotools installation, e.g. "EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools." or "SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools."
See #135 but we still need to change rpbp.
[ ] Update documentation on RTD. Use rst instead of md? Is there a simpler way than to build documentation using sphinx? Or can it be integrated (config/hook)? Also update README.md.
[ ] We get this when using samtools (but this does not seem to affect the results...)
samtools: /beegfs/prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/bin/../lib/libtinfow.so.6: no version information available (required by samtools)
samtools: /beegfs/prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/bin/../lib/libncursesw.so.6: no version information available (required by samtools)
samtools: /beegfs/prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/bin/../lib/libncursesw.so.6: no version information available (required by samtools)
The samtools warnings are annoying, but I don't know if we can do anything...
Currently working on docs and next release (WIP). The rest is minor, and can probably wait for some patch release.
See #146
Conda environment
What can we do?