
uab-cgds-worthey / DITTO

Variant Deleteriousness prediction tool using AI

GNU General Public License v3.0


Ditto pipeline just after CAGI6 project - [merged] #13

Closed by ManavalanG 1 year ago

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 18, 2021, 17:09

Merges CAGI6-version -> master

This is a pipeline to run Ditto predictions on one or more samples. We successfully used it to analyze samples from the CAGI6 project.

Note: The commit used for the CAGI6 challenge pipeline is be97cf5dbfcb099ac82ef28d5d8b0919f28aed99. It was used along with annotated VCFs and exomiser scores obtained from the rgp_cagi6 workflow.

Things to check:

- [x] Check the ReadMe
- [x] Check if scripts for VEP annotation are present
- [x] Check if scripts for VEP annotation parsing are present
- [x] Check if scripts for Ditto filtering and parsing are present

ManavalanG commented 3 years ago

Expanding on this a bit would be helpful as this is one of the first things typically looked at to learn what the repo is about.

ManavalanG commented 3 years ago

This repo uses git submodules (for example, variant_annotation/configs/snakemake_slurm_profile). To retrieve them as well, we need to use the --recurse-submodules flag here.
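
For reference, a minimal sketch of the two usual ways to fetch submodules with git. The helpers only build the command lines; the function names are hypothetical, and `repo_url` and `dest` are placeholders rather than this project's actual values.

```python
import subprocess
from typing import List, Optional


def clone_with_submodules_cmd(repo_url: str, dest: Optional[str] = None) -> List[str]:
    """Build a `git clone` command that also fetches all submodules."""
    cmd = ["git", "clone", "--recurse-submodules", repo_url]
    if dest:
        cmd.append(dest)
    return cmd


def init_submodules_cmd() -> List[str]:
    """Build the command that populates submodules in an existing clone."""
    return ["git", "submodule", "update", "--init", "--recursive"]


def run(cmd: List[str], cwd: Optional[str] = None) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run(clone_with_submodules_cmd(repo_url))` would cover a fresh clone, while `run(init_submodules_cmd(), cwd=dest)` would cover a checkout that was cloned without the flag.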

ManavalanG commented 3 years ago

How about moving this file to src dir?

ManavalanG commented 3 years ago

Minor: pip libraries are second-class citizens in the conda world and should be avoided whenever possible. Since black and lz4 are available from conda-forge, I would recommend switching to them.

ManavalanG commented 3 years ago

Minor: Missing version of gpy
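
Both remarks on the env file (pip-installed packages and a missing version pin) could be checked mechanically. The following is only a rough sketch, assuming a conventionally laid out conda environment file; it uses naive line-based parsing rather than a YAML library, and the sample file with its version numbers is made up for illustration, not copied from this repo.

```python
import re
from typing import Dict, List

# Made-up example, NOT the project's actual testing.yaml.
EXAMPLE = """
name: testing
channels:
  - conda-forge
dependencies:
  - python=3.9
  - bcftools=1.12
  - gpy
  - pip
  - pip:
      - black==21.9b0
      - lz4
"""


def audit_conda_env(yaml_text: str) -> Dict[str, List[str]]:
    """Report unpinned conda dependencies and packages installed via pip."""
    report: Dict[str, List[str]] = {"unpinned": [], "pip": []}
    in_deps = False
    pip_indent = None  # indent of the "- pip:" item while inside its sub-list
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments (naive)
        if not line.strip():
            continue
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        if not stripped.startswith("-"):
            # A non-list line at column 0 is a top-level key such as "channels:".
            if indent == 0:
                in_deps = stripped.startswith("dependencies:")
                pip_indent = None
            continue
        if not in_deps:
            continue
        item = stripped[1:].strip()
        if item == "pip:":
            pip_indent = indent
            continue
        if pip_indent is not None and indent > pip_indent:
            report["pip"].append(item)
            continue
        pip_indent = None
        # A conda spec without =, < or > carries no version constraint.
        if not re.search(r"[=<>]", item):
            report["unpinned"].append(item)
    return report
```

For the made-up file above, `audit_conda_env(EXAMPLE)` lists `gpy` and `pip` as unpinned and `black==21.9b0` and `lz4` as pip-installed.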

ManavalanG commented 3 years ago

Minor: A quick note on why this gets removed would be helpful.

ManavalanG commented 3 years ago

Minor: As the testing conda env already contains bcftools, we could just use that and skip the module loading.

ManavalanG commented 3 years ago

Minor: Including the conda env used here would be helpful. This applies to the rest of the doc as well.

ManavalanG commented 3 years ago

Referring to the cagi6-rgp repo as an example would be helpful here. Alternatively, in the next section, Cohort level analysis, you could mention that exomiser scores were used.

ManavalanG commented 3 years ago

Please refer to [CAGI6-RGP](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/mana/mini_projects/rgp_cagi6) project for filtering and annotation.

ManavalanG commented 3 years ago

You may want to clarify what filtering refers to here, so as to avoid confusion with the downstream filtering done as part of Ditto.

ManavalanG commented 3 years ago

@tkmamidi Ready for your eyes :)

Comments marked as minor would ideally be taken care of at this point, but if you prefer to do it after this MR is merged, I'm okay with that as well.

ManavalanG commented 3 years ago

A few questions. When do you plan to adopt the following?

- Auto-formatting Python scripts using black
- Linting using pylint, eslint, markdown lint, etc.

If you prefer to do it after this MR, please create issues to keep track of them.

ManavalanG commented 3 years ago

Commit used for CAGI6 pipeline - be97cf5dbfcb099ac82ef28d5d8b0919f28aed99

I would recommend clarifying this part in MR description. Perhaps something like:

> Note: The commit used for the CAGI6 challenge pipeline is be97cf5dbfcb099ac82ef28d5d8b0919f28aed99. It was used along with annotated VCFs and exomiser scores obtained from the rgp_cagi6 workflow.

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:14

Commented on README.md line 20

changed this line in version 2 of the diff

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:14

Commented on predict_variant_score.sh line 1

changed this line in version 2 of the diff

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:14

Commented on configs/envs/testing.yaml line 22

changed this line in version 2 of the diff

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:14

Commented on configs/envs/testing.yaml line 21

changed this line in version 2 of the diff

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:14

Commented on README.md line 70

changed this line in version 2 of the diff

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:14

Commented on README.md line 73

changed this line in version 2 of the diff

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:14

Commented on README.md line 125

changed this line in version 2 of the diff

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:14

added 1 commit

06e3d2ed - updating readme and linting


ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:17

Frankly, I don't like formatting the scripts but did it anyway for this MR. It makes the scripts look so confusing. Maybe I'll get used to it eventually? Only time will tell :D

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 24:20

added 1 commit

ac1d07c3 - update about * alleles


ManavalanG commented 3 years ago

marked the checklist item Check the ReadMe as completed

ManavalanG commented 3 years ago

I see you added this to Readme.md, which is not a bad idea. I was referring to clarifying it in MR description though.

ManavalanG commented 3 years ago

Btw, I modified the "note" in the parent comment for better clarity.

ManavalanG commented 3 years ago

Which Python code formatter was used? I don't see some of the typical modifications that black would make, hence the question.

Formatting can be subjective sometimes, and there are times I don't like the choices that black makes. But they are rather infrequent, so I chose to live with it. If you don't like black, check whether you like any of the other code formatters out there. In my experience, having some combination of linters and formatters goes a long way toward keeping your codebase manageable over time, irrespective of the number of code contributors.

ManavalanG commented 3 years ago

> Linting using pylint, eslint, markdown lint, etc.

Could you comment on these?

ManavalanG commented 3 years ago

added 1 commit

265f3832 - formats using black


ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 18:02

marked the checklist item Check if scripts for VEP annotation are present as completed

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 18:02

marked the checklist item Check if scripts for VEP annotation parsing are present as completed

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 21, 2021, 18:02

marked the checklist item Check if scripts for Ditto filtering and parsing are present as completed

ManavalanG commented 3 years ago

Same concern as discussed for testing.yaml.

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 25, 2021, 13:55

Commented on README.md line 93

We have "activate conda environment" before these steps. Do we still need to mention it for every step?

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 25, 2021, 13:57

Commented on README.md line 116

mentioned in the next section "Cohort level analysis"

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 25, 2021, 13:57

I'm using pylint for now.

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 25, 2021, 13:58

added to MR description

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 25, 2021, 14:00

Commented on configs/envs/environment.yaml line 26

A lot of changes need to be done with tuning. Can I get back to this later? Also, this Readme doesn't talk about this env or the training pipeline. Thoughts?

ManavalanG commented 3 years ago

So all the commands that follow would be run in that conda environment?

ManavalanG commented 3 years ago

Oh good.. How about eslint and markdown lint?

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 25, 2021, 14:10

Commented on README.md line 93

Yes sir

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 25, 2021, 14:10

I'll try them after my Quals?

ManavalanG commented 3 years ago

Sure but please add an issue(s) for them.

ManavalanG commented 3 years ago

Could you update that text to match the one in the MR description? The current one is a bit confusing (I know I was the source of that confusion :laughing: ).

ManavalanG commented 3 years ago

Please file issues as needed to track them. Also, I would suggest creating an empty section for those topics in the readme doc, with just todo as text in them.

ManavalanG commented 3 years ago

> A lot of changes need to be done with tuning. Can I get back to this later?

Sounds good, but please file it as an issue.

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 25, 2021, 15:17

done

ManavalanG commented 3 years ago

Awesome!

PS- #9

ManavalanG commented 3 years ago

In GitLab by @tkmamidi on Oct 25, 2021, 15:28

Commented on README.md line 116

I have the same note in ReadMe
