A list of minor/low-priority issues labeled as "enhancement". We need to re-visit this list.

Documentation

[X] We need to brush up the documentation. Maybe it would be a good idea in the long term to switch to RST.

I'm working on this now.

Installation

[X] Create a separate folder for the models from different versions of Python: there seem to be some issues unpickling Stan models using different versions of Python. So it would make sense to make the model path something like: $BASE/rpbp_models/python-<version>/...

Moved to CmdStanPy, no pickling anymore, models are installed/compiled under the conda environment by default.

[X] Add a setup option to force recompilation of Stan models: by default, if the pickled Stan models already exist, they are not recompiled. This can sometimes cause problems when the PyStan version changes and the existing models are no longer backwards compatible.

This is not entirely resolved; it is tracked in #133.
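
A minimal sketch of what a "force recompilation" option could look like. The flag name `--force-compile` and the helper `needs_compilation` are hypothetical and are not existing Rp-Bp options; the sketch only illustrates the decision logic (recompile when asked to, or when no compiled model is found).

```python
import argparse
from pathlib import Path


def needs_compilation(compiled_model: Path, force: bool = False) -> bool:
    """Return True if the model should be (re)compiled.

    Hypothetical helper: recompile when explicitly requested, or when
    no compiled artifact exists at the given path.
    """
    return force or not compiled_model.exists()


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Compile Stan models (sketch).")
    # Hypothetical flag name, used here for illustration only.
    parser.add_argument(
        "--force-compile",
        action="store_true",
        help="recompile the models even if compiled versions already exist",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Placeholder path, an assumption for this sketch.
    model = Path("models/example-model")
    print("compile" if needs_compilation(model, args.force_compile) else "skip")
```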

Visualisation

Reporting/downstream analyses done via Dash.

[X] ORF visualization: add additional genome browser tracks such as:
1. Coverage profile of RNA-seq data (bedgraph https://genome.ucsc.edu/goldenpath/help/bedgraph.html -> bigWig)
2. Coverage profile of RiboSeq data (all & periodic)
3. P-site profile
Adding the BAM files to IGV is not so helpful because they include the entire reads and are not shifted to account for P-site offsets. A brief online search suggests the best approach is probably to first convert the P-site BED object to wiggle, then the wiggle to bigWig.
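
The P-site track described above could be sketched as follows. This is an illustration under stated assumptions, not the Rp-Bp implementation: reads are given as unspliced BED-style intervals (0-based, half-open), the per-length offsets are supplied as a plain dictionary, and the output is written as bedGraph rather than wiggle, since bedGraph can also be converted to bigWig. The function names are hypothetical.

```python
from collections import Counter


def psite_counts(reads, offsets):
    """Count P-site positions.

    reads: iterable of (chrom, start, end, strand), 0-based half-open.
    offsets: dict mapping read length -> P-site offset from the 5' end.
    Reads whose length has no offset are skipped. Spliced reads are
    not handled in this sketch.
    """
    counts = Counter()
    for chrom, start, end, strand in reads:
        offset = offsets.get(end - start)
        if offset is None:
            continue
        # The 5' end is `start` on the plus strand and `end - 1` on the minus strand.
        pos = start + offset if strand == "+" else end - 1 - offset
        counts[(chrom, pos)] += 1
    return counts


def to_bedgraph(counts):
    """Merge adjacent positions with equal counts into bedGraph lines."""
    lines = []
    run = None  # [chrom, start, end, value]
    for (chrom, pos), value in sorted(counts.items()):
        if run and run[0] == chrom and run[2] == pos and run[3] == value:
            run[2] = pos + 1  # extend the current interval
        else:
            if run:
                lines.append("\t".join(map(str, run)))
            run = [chrom, pos, pos + 1, value]
    if run:
        lines.append("\t".join(map(str, run)))
    return lines


example_reads = [
    ("chr1", 100, 129, "+"),
    ("chr1", 100, 129, "+"),
    ("chr1", 101, 130, "+"),
    ("chr1", 101, 130, "+"),
    ("chr1", 200, 229, "-"),
    ("chr1", 50, 80, "+"),  # length 30 has no offset here, so it is skipped
]
example_offsets = {29: 12}  # made-up offset, for illustration only
bedgraph = to_bedgraph(psite_counts(example_reads, example_offsets))
```

The resulting lines, written to a file, can then be converted with the UCSC `bedGraphToBigWig` utility, which takes the bedGraph file, a chromosome sizes file, and the output path.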
[X] Replicate correlation plots: add correlation plots of RPMs (or some other normalised value) between replicates after corrected assignment, at the codon level and maybe at the nucleotide level (see replicate ORF profiles).
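
As a rough illustration of the values behind such a plot, the sketch below normalises per-position counts to reads per million (RPM) and computes the Pearson correlation between two replicates. The function names and the example counts are made up for illustration; Pearson is one possible choice of correlation measure, not necessarily the one used in the reports.

```python
import math


def rpm(counts):
    """Scale raw counts to reads per million."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalise: total count is zero")
    return [c * 1_000_000 / total for c in counts]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Made-up per-codon counts for two replicates.
replicate_1 = [10, 20, 30, 40]
replicate_2 = [20, 40, 60, 80]
r = pearson(rpm(replicate_1), rpm(replicate_2))
```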
[X] Handle all levels of "sample" specification in get-all-orf-peptide-matches:

The script is hard-coded to work with "cell-types" from the config file. It would be nice if it also handled samples ("riboseq_samples" key) and conditions ("riboseq_biological_replicates" key).
1. [ ] Add a command line option to the script to specify the level
2. [ ] Add a function to ribo_utils.py which returns a list of the appropriate names
3. [ ] Use this function rather than the call to ribo_utils.get_riboseq_cell_type_samples
4. [ ] Add a function to ribo_utils.py which returns the appropriate "peptide_analysis" dictionary
5. [ ] Use that in the loop
This will also entail finding the correct filename based on the level (e.g., "sample" filenames include lengths and offsets, while the others do not; the locations are different).
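
Steps 1 and 2 above could be sketched as follows. The `--level` option, the function `get_names_for_level`, and the level labels are hypothetical. The "riboseq_samples" and "riboseq_biological_replicates" keys are taken from the description above, while the cell-type key name is an assumption, since it is not given here.

```python
import argparse

# Maps each level to the config key that lists its names.
LEVEL_TO_CONFIG_KEY = {
    "sample": "riboseq_samples",
    "condition": "riboseq_biological_replicates",
    "cell-type": "riboseq_cell_type_samples",  # hypothetical key name
}


def get_names_for_level(config, level):
    """Return the sorted names defined in the config for the given level."""
    key = LEVEL_TO_CONFIG_KEY[level]
    # A missing key is treated as "no names at this level".
    return sorted(config.get(key, {}))


def build_parser():
    parser = argparse.ArgumentParser(description="Peptide matches (sketch).")
    parser.add_argument(
        "--level",
        choices=sorted(LEVEL_TO_CONFIG_KEY),
        default="cell-type",  # keeps the current hard-coded behaviour as default
        help="level at which to collect ORF-peptide matches",
    )
    return parser


# Made-up config content, for illustration only.
example_config = {
    "riboseq_samples": {"s2": "s2.fastq.gz", "s1": "s1.fastq.gz"},
    "riboseq_biological_replicates": {"ctrl": ["s1", "s2"]},
}
```

The filename lookup mentioned above is deliberately left out, since it depends on where each level's files are stored.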
[X] Create proteomics results plots: add notebooks and plots to the peptide report which show the proteomics results.
1. [x] Venn diagram of detected peptide sequences with given PEP threshold
2. [ ] Add detected peptides overlap to proteomics-report
3. [x] Venn diagram of in silico digested proteins
4. [ ] Add possible peptides to proteomics-report
5. [ ] Match (identified) peptides to transcripts
6. [ ] Add matched transcripts to proteomics-report
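
The overlap behind the Venn diagram of detected peptides (item 1) could be computed along these lines. This is a sketch with made-up data: it assumes each detection is a (sequence, PEP) pair and that a peptide counts as detected when its PEP is at or below the threshold. The function names are hypothetical.

```python
def detected_peptides(rows, pep_threshold):
    """Return the set of peptide sequences passing the PEP threshold.

    rows: iterable of (sequence, pep) pairs. Lower PEP means higher
    confidence, so peptides with pep <= pep_threshold are kept.
    """
    return {seq for seq, pep in rows if pep <= pep_threshold}


def venn_regions(a, b):
    """Split two sets into the three regions of a two-set Venn diagram."""
    return {"only_a": a - b, "both": a & b, "only_b": b - a}


# Made-up detections for two samples.
sample_a = [("PEPTIDEK", 0.001), ("LESSSUREK", 0.2), ("SHAREDR", 0.01)]
sample_b = [("SHAREDR", 0.005), ("OTHERK", 0.02)]

regions = venn_regions(
    detected_peptides(sample_a, pep_threshold=0.05),
    detected_peptides(sample_b, pep_threshold=0.05),
)
sizes = {name: len(members) for name, members in regions.items()}
```

The region sizes can then be handed to whichever plotting library draws the diagram.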

dieterich-lab / rp-bp

General enhancements #88