zavolanlab / bindz-rbp

RBP module for bindz, a bioinformatics tool to detect regulators' binding sites on RNA sequences.
https://github.com/zavolanlab/bindz-rbp
Apache License 2.0
6 stars 1 forks source link

Combine motif results #17

Closed krish8484 closed 4 years ago

krish8484 commented 4 years ago

Description

Solves #15

Fixes # (issue)

Type of change

Please delete options that are not relevant.

Checklist:

krish8484 commented 4 years ago

I have converted this to a draft PR because there are some things that need to change. For example, I didn't know the path of the posterior files in the folders so just to check my code I have added the dev folder (which will be deleted later, in this PR). The code seems to be running fine, I have added the example tsv file which it generated taking the dev folder as input. I have also commented other snakemake rules to just check if my rule is running appropriately and it is, @AngryMaciek you can review it once and let me know if anything needs to be changed. Also, what will be the path of the input files(posterior_site files)? I will have to edit that in the snakemake file of this branch and then uncomment all the rules, so that the test pass and also delete the dev folder after the path of posterior_site files are known and then make this PR final Simply typing snakemake --cores 1 in the terminal will run the latest rule. Thanks!

AngryMaciek commented 4 years ago

A few initial comments I can already make:

  1. Please clone this repository and create a new branch from dev under this repository. I do not keep track what do you do in your personal fork, and is it up to date with this one, for the sake of completeness here we always work from and merge to this repository.
  2. As for the testing - please create a proper test for your script under tests/unit, just like I do for one other bash script. You may generate a dummy input by running a pipeline in its current state and inspecting the output. Please select 5 directories under {output}/motevo/ from the output and add them as a input for a unit test for your scripts. Run the test locally and then create a list of outuput files and their md5 sums (just like I do for the other unit test). After that please add testing your own script and md5 comparison to the CI (just as for the other bash script). So just please follow the same strategy as for the other scripts.
  3. Commenting-and-uncommenting is not a proper test for a snakemake rule. Also the command you provided is not a correct way to call this pipeline. Please use the calls in scripts integration/execution. Once you are sure that your script works fine (the previous point) then you have to adjust the snakefile to add your rule at the end. The results of your rule will be gathered by the all rule. So indeed you will expand the pipeline by one more step at the end.
  4. After you do that the integration test will be out of date, so you will have to update the files and their md5 sums in test/integration as well.

I can explain this in more detail on a meeting tomorrow. There are two tasks here: (1) add your script, input data and tests for it, (2) expand snakemake pipeline with the script.

krish8484 commented 4 years ago

Yes, I agree with the above points I know that the tests need to be added and snakefile needs to be changed, that is why this is still a draft PR. I had also made a branch on this repository but could not push my code to remote because i dont have write access or write permission for this repository, thats why I had to send in the code using fork.

krish8484 commented 4 years ago

We will discuss this at the meeting tomorrow.