zavolanlab / bindz-rbp

RBP module for bindz, a bioinformatics tool to detect regulators' binding sites on RNA sequences.
https://github.com/zavolanlab/bindz-rbp
Apache License 2.0
6 stars 1 forks source link

Plot sequence logos #25

Closed krish8484 closed 4 years ago

krish8484 commented 4 years ago

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. This PR aims to plot the sequence logos using python's logomaker which may be used along with the heatmaps in the future. Fixes #23

Type of change

Please delete options that are not relevant.

Checklist:

krish8484 commented 4 years ago

Are the sequence logos as expected? They are in output/sequence_logos.

AngryMaciek commented 4 years ago

Are the sequence logos as expected? They are in output/sequence_logos.

Dear Krish,

There is no directory called output anywhere I see, especially not uner the root of the repository... However, I have found some sample plots for HNRNPF undet unit test directory for this script so I will refer to these. It's OK but I think of 3 changes:

krish8484 commented 4 years ago

Hi Maciek, The directory output/sequence_logos will be in tests/integration after you run the snakefile, though the png files in unit tests depict a similar output.

AngryMaciek commented 4 years ago
AngryMaciek commented 4 years ago

If logomaker does not support plotting information content out-of-the-box, could you please calculate the contents youself? According to these specifications: https://en.wikipedia.org/wiki/Position_weight_matrix#Information_content We do not need to deal with background probabilities, the base of log is 2.

AngryMaciek commented 4 years ago

As I was goung through the Files changed section here on GitHub and reviewing the code I noticed that under the dedicated directory for unit-testing this script you have also added the png sample output files to the repository (ex: tests/unit/plot_sequence_logos/motif_HNRNPF_820.png). Could you please remind me - do we need them in the repository? It just flew through my head that maybe it is not necessary to include them, since every unit test will generate them once again, right?

If that is correct - please remove the sample output files from unit testing of this script as well as previous unit tests: combine_results, Plot-heatmap-for-motifs,

krish8484 commented 4 years ago

I have fixed the indentation in the Snakefile at the places you mentioned, removed the top and the right axes from the graph, changed the T to U in the input file in the unit tests, and changed the script to display information instead of probability. Deleted the superfluous output files from unit tests and updated the md5sums of the output files.

AngryMaciek commented 4 years ago

Before I merge it please make sure that you have complied with all the items from the Checklist: in the initial post of the PR.

AngryMaciek commented 4 years ago

@krish8484 I have cleaned the code a little, indents (🙃), removed unnecessary expanions and importantly added two more sections to all your rules (which I forgot about earlier):