ionpaani commented 8 months ago

[tasklist]

[x] Early version of code into final_version.py on main branch
[x] Pull request this into develop branch as a proof of concept.
[x] HISTORY_CHANNEL: Open new Pull Request from develop into main (this will keep open a channel and document the story)
[x] From develop branch we create personal branches
[x] Each tweaks a very little of the code in one file: final_version.py
[x] Each submits a Pull Request into develop branch and is reviewed.
[x] Review and confirmation of merge OR discussion on how a code conflict with develop can be resolved on their branch before merge.
[ ] After merge each is deleted.
Update 26/02/24: I am no longer sure whether we should delete the branches. The rubric says to keep them, yet at a tutorial Pete said everything should be on main and the rest should be clean. I've restored all our branches and think we should just document everything.
[x] Develop will now have a history of commits from each contributor
[x] Only after develop branch has been tidied up can the pull request into main (the HISTORY_CHANNEL) be completed. But this could also be left open for Pete to review work.
Update 03/03/24: please see the later comments. develop was closed, and we now have a "development" branch from which we all branched out.

An early version of @Aqeelgene's code was put into final_version.py on the main branch. This needs to be pulled into the develop branch via a pull request.

We now have the same code in both main and develop. This keeps the main branch clean. I don't know how to do this, but I think GitHub may show a prompt to protect the main branch when you open this repo.

At this stage, prior to branching off develop, we can open a pull request from develop into main. Keep this pull request open until each small feature has been merged into develop. This keeps a space open in which dialogue can be added.

From develop branch, new branches can be created and small features added to each branch. This could be a few lines of code that improve the final_version.py file.

Each branch can submit one pull request into develop.

Be careful to only tweak a small part of the code or else face conflicts. Each branch could in theory add one line of code the first time round.

Overall this will create a history of what has been done by each branch (collaborator).

Don't forget to delete your personal branch after it's been merged into develop.

Original image created @ionpaani

ionpaani commented 8 months ago

When we first started documenting information we used OneNote; that information is now being transferred to GitHub. Ahmad had created skeleton code using ChatGPT, based on a prompt written in layman's terms. This is an example of the prompt:

I want you to write me code based on the following information

I will give you a variant.
Check the variant in variant validator and get the most recent transcript and use the result from variant validator transcript and put it in VEP and get the information for that variant.
Give me code that I can put into Pycharm which is within Ubuntu Linux

ionpaani commented 8 months ago

This is an example of the skeleton code, which Ahmad systematically worked through to produce the code named "final_version.py".

Skeleton code: Here’s a basic outline of the Python script (you’ll need to fill in the details):

# Example Python script

import subprocess

def validate_variant(variant):
    # Use VariantValidator to validate the variant
    # Implement your code here
    pass

def get_transcript_info(variant):
    # Retrieve the most recent transcript info from VariantValidator
    # Implement your code here
    pass

def run_vep(transcript_info):
    # Run VEP with the transcript info
    # Implement your code here
    pass

def main():
    variant = input("Enter the variant description: ")

    # Validate the variant
    if validate_variant(variant):
        # Get transcript info
        transcript_info = get_transcript_info(variant)

        # Run VEP
        vep_output = run_vep(transcript_info)

        # Process VEP output (extract relevant details)
        # Implement your code here
        print("Variant information:")

        # Print relevant details from VEP output
        # Implement your code here
    else:
        print("Invalid variant description. Please check the format.")

if __name__ == "__main__":
    main()

ionpaani commented 8 months ago

This is the code, final_version.py, which @Aqeelgene gave to me (@ionpaani) as a task to modify.

import requests
import re
import json

# Function to fetch transcript ID using gene name
def fetch_transcript_id(gene_name):
    url = f"https://rest.variantvalidator.org/VariantValidator/tools/gene2transcripts_v2/{gene_name}/mane_select/refseq/GRCh37?content-type=text/xml"
    response = requests.get(url)
    if response.status_code != 200:
        print(f"Error fetching data from the server. Status code: {response.status_code}")
        return None

    match = re.search(r'<reference type="str">(.*?)</reference>', response.text)
    return match.group(1) if match else None

# Function to extract the hg38 genomic description using regex
def extract_hg38_genomic_description(response_text):
    match = re.search(r'"hg38":\s*{\s*"hgvs_genomic_description":\s*"([^"]+)"', response_text)
    return match.group(1) if match else None

# Function to get data from Ensembl VEP API using hg38 ID
def get_ensembl_vep_data(hg38_id):
    url = f"https://rest.ensembl.org/vep/human/hgvs/{hg38_id}"
    headers = {"Content-Type": "application/json"}
    response = requests.get(url, headers=headers)
    if response.status_code != 200:
        print(f"Error fetching data from Ensembl VEP API. Status code: {response.status_code}")
        return None
    return response.json()

# Main code
server = "https://rest.variantvalidator.org/VariantValidator/variantvalidator/"
print("Welcome to VariantBridge!\n\n"
      "Convert your genetic variants from hg19 to hg38. Please input the variant in HGVS or VCF format (hg19), e.g.:\n"
      "- HGVS: NM_000088.3:c.589G>T\n"
      "- HGVS: NC_000017.10:g.48275363C>A\n"
      "- HGVS: NG_007400.1:g.8638G>T\n"
      "- VCF: 17-50198002-C-A\n"
      "- VCF: 17:50198002:C:A\n"
      "- VCF: chr17:50198002C>A\n"
      "- VCF: chr17:g.50198002C>A\n\n"
      "Please enter your data in one of these formats to proceed with the conversion.")

variant = input("Insert variant:")

# Check if variant starts with 'N'
if variant.startswith('N'):
    ext = variant.split(':')[0]  # Use part of the variant before the colon
else:
    gene_name = input("Enter the gene name: ")
    ext = fetch_transcript_id(gene_name)
    if ext is None:
        print("No transcript ID found for the given gene name.")
        exit()

response = requests.get(f"{server}hg19/{variant}/{ext}")
if response.status_code == 200:
    hg38_genomic_description = extract_hg38_genomic_description(response.text)
    print(f"HG38 Genomic Description: {hg38_genomic_description}")
    if hg38_genomic_description:
        ensembl_vep_data = get_ensembl_vep_data(hg38_genomic_description)
        print("Ensembl VEP Data:", ensembl_vep_data)
    else:
        print("No hg38 genomic description found, cannot query Ensembl VEP API.")
else:
    print(f"Error: Received response code {response.status_code}")

print(response)

# Json output format
decode = response.json()
print(repr(decode))

# Write the json to file
fileName = "output.txt"
#if saveAnnotations:
#fileName = makeFileName("variant")
if fileName:
    file = open(fileName, "w")
    file.write(json.dumps(decode, sort_keys=True, indent=2))
    file.close()
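The regex-based extraction step in final_version.py can be tried offline, without calling VariantValidator. The sketch below is only an illustration: it assumes the pattern was meant to allow optional whitespace around the opening brace (escaped here for clarity), and the sample text is a made-up fragment, not a real API response.

```python
import re

# Assumed to correspond to the pattern used in final_version.py;
# the opening brace is escaped so it is clearly a literal character.
HG38_PATTERN = re.compile(
    r'"hg38":\s*\{\s*"hgvs_genomic_description":\s*"([^"]+)"'
)

def extract_hg38_genomic_description(response_text):
    """Return the first hg38 genomic HGVS description found, or None."""
    match = HG38_PATTERN.search(response_text)
    return match.group(1) if match else None

# Hypothetical payload fragment, invented for illustration only.
sample_text = '{"hg38": {"hgvs_genomic_description": "NC_000017.11:g.50198002C>A"}}'

print(extract_hg38_genomic_description(sample_text))
print(extract_hg38_genomic_description('{"hg19": {}}'))
```

The second call prints None, which is the case the main code reports as "No hg38 genomic description found".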

ionpaani commented 8 months ago

From this point - Ahmad assigned issues

Task 1 - saeeda
Task 2 - zahra
Task 3 - violetta

ionpaani commented 8 months ago

Meeting Saturday 17/02/24

Adding comments to each team member's respective files

ionpaani commented 8 months ago

@Aqeelgene created his own branch AhmadDevelop1
He updated the final_version.py file with additional comments.
This was then pushed to "Develop" branch
I pulled the information and then updated the Project_task1.py file with comments
The files were renamed (1) final_version.py file to final_version1.py and (2) Project_task1.py to Project_task1.1.py
This was pushed to the remote "ionpaani_2" branch
This was compared against the "Develop" base branch and the pull request was merged
Violetta pulled the "Develop" base branch to her local machine
Created her own branch "ViolettaDevelop1"

ionpaani commented 8 months ago

We are facing issues with Violetta's Ubuntu / Pycharm.

Pycharm keeps crashing and there is difficulty in getting the interpreter
The Ubuntu interface almost seems like it's about to crash then doesn't
Finally Violetta managed to pull down the information from "Develop" however
Cannot seem to get Git to work in any capacity - either in terminal or via Pycharm itself

Violetta has decided to reinstall Pycharm

ionpaani commented 8 months ago

24th February 2024

Created a new branch ionpaani_3 and did the following changes.

I have modified and renamed files. These are:

1. modified: Task1_API.py. Added a title comment and further comments explaining the module imports at the start. The title comment is:

This file Task1_API.py is an attempt to create an API. I tried this as a follow on from Project_task1.1.py. I also added comments to this 24th February 2024

2. renamed: Task2_API2.py -> Task1_API2.py. Added a title comment and renamed this file. The title comment is:

This file "Task1_API2" is a follow on from "Task1_API" and was an attempt to create fields to enter the required information. I didn't take this any further

I renamed this because I had it as Task2 when it is really part of Task1. I changed it to avoid any confusion.

3. modified: variant_bridge/Project_task1.1.py. This file is "Task 1", where I addressed the issue which Ahmad gave me. I added a title comment:

This is the Project_task1.1.py to which Saeeda added comments on 17 Feb 2024

4. modified: variant_bridge/final_version1.py. I added the title comment below. No other changes have been made. It is exactly how @Aqeelgene had it.

This is the final_version1.py to which Ahmad added comments on 17 Feb 2024

VIV0503 commented 8 months ago

26th February 2024

Up until this point I was having issues with Pycharm, Git and my laptop. I have spoken to the manufacturer and these laptop issues are almost resolved. And for Pycharm and Git, I reinstalled everything.

Work I have completed today

I wanted to practise doing the Git work and saw that @ionpaani had raised a pull request that "develop" needed to be updated. So I asked @ionpaani to make sure I was working from the most up-to-date team develop branch.

I then went to Pycharm and created my own branch Violetta1 from the updated "develop" branch.

@ionpaani had also raised an issue to merge the environment.yml file into the main variant_bridge folder. I did this too and have now closed the issue.

In summary I have discussed and compared changes with @ionpaani before I merged information from two branches (ionpaani3 and develop) and pulled information to update the main team develop branch. I have refactored a file into the main variant_bridge folder

Aqeelgene commented 8 months ago

Dear Team members:

It is simply an honour to work with such a great team in a very productive and creative environment. Thanks a million for giving me such an honour.
For tasks: as agreed before and mentioned above by Saeeda, Task 1 is for Saeeda, Task 2 is for Zahra and Task 3 is for Violetta.
For the issues Violetta is facing, it is wise to help Violetta while finishing Tasks 1 and 2 in parallel, so that we can then all help Violetta finish Task 3.
It is also important to keep testing every step to make sure that the code works properly.
Thank you Saeeda a million times for coordinating the workload after each meeting. You make our lives and work SO MUCH EASIER.
I want to remind everyone, myself first and always, of the benefits of working as a team and how far we have come together in this project, especially when the team works in a family-oriented environment. So let's keep up the excellent work!

ionpaani commented 7 months ago

Closed issue task1 as it is completed, with comments and titles added per file.

Azahra1214 commented 7 months ago

Created the branch Zahratask2 for completion of the task. I pushed this and the information has now been merged into the develop branch.

I also had created a requirements file specific for Project_task2. This needs to be merged.

@Aqeelgene I have now closed the issue raised asking me to complete task2.

Aqeelgene commented 7 months ago

For the purpose of the complete project a log file was created. This was named Project_log.py

This was tested on my local machine using the FBN1 variant NM_000138.5:c.356G>A. However, it needs to be tested further, so a pull request has been created for @VIV0503.

Aqeelgene commented 7 months ago

The team was involved in creating the Project_log.py file. This file logs the steps of coding so it will be easier to track if there are any further errors. The code was tested by all four members systematically during the team meeting on 29th Feb / 1st March. This seemed to work well for all members. The code was then pushed to the develop branch.
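Project_log.py itself is not reproduced in this thread. As a rough idea of what step logging can look like, here is a minimal sketch using Python's standard logging module. The logger name, the log file path and the `log_step` helper are hypothetical and are not taken from the team's file.

```python
import logging

def make_step_logger(log_path, name="variant_bridge"):
    """Create a logger that appends timestamped messages to log_path.

    Both the default name and the file layout are illustrative assumptions.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger

def log_step(logger, step, detail):
    """Record one step of the script so a failure can be traced to it."""
    logger.info("%s: %s", step, detail)
```

For example, calling `log_step(logger, "VariantValidator", "status 200")` after each request would leave a trail showing which step was reached before an error occurred.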

VIV0503 commented 7 months ago

Created a new branch Violetta2 to which the references.txt file was added. This was pushed - a pull request created and the information merged into the develop branch

ionpaani commented 7 months ago

After the reference file was added to the branch Violetta2 a pull request was raised by @VIV0503. I merged this into the develop branch.

Aqeelgene commented 7 months ago

@ionpaani

Thank you Saeeda for the excellent efforts in this project!

ionpaani commented 7 months ago

Created the "developing" branch - please refer to https://github.com/Aqeelgene/Sprint2andProject_007/pull/22

Azahra1214 commented 7 months ago

Attempts at test file for Project_task1.1.py and Project_task2.py

Draft code pushed to Zahratesting branch - pull request raised - @ionpaani merged into developing.

Azahra1214 commented 7 months ago

The test code is now working for Project_task1_1.py

It wasn't working because the import statement did not reference the correct file name. As soon as the file name was corrected, the test started importing functions from the correct file, so it is working now.

Incorrect file name: Project_task1.1.py
Correct file name: Project_task1_1.py
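A likely explanation, offered here as an assumption rather than something confirmed in the thread: in an import statement a dot separates a package from its submodule, so a plain `import` cannot name a file whose stem contains a dot. The hypothetical helper below checks whether a file name can be imported by its stem.

```python
from pathlib import Path

def is_importable_module_file(filename):
    """Return True if the file's stem is a valid Python identifier,
    which a plain `import <stem>` statement requires."""
    return Path(filename).stem.isidentifier()

print(is_importable_module_file("Project_task1.1.py"))  # False
print(is_importable_module_file("Project_task1_1.py"))  # True
```

Replacing the dot with an underscore, as was done here, gives a stem that Python accepts as a module name.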

Aqeelgene commented 7 months ago

Updated my Ahmad_Main_Submission with main and closed the pull request

ionpaani commented 7 months ago

please refer to https://github.com/Aqeelgene/Sprint2andProject_007/pull/22 and https://github.com/Aqeelgene/Sprint2andProject_007/pull/26

ionpaani commented 7 months ago

The team was involved in creating the Project_log.py file. This file logs the steps of coding so it will be easier to track if there are any further errors. The code was tested by all four members systematically during the team meeting on 29th Feb / 1st March. This seemed to work well for all members. The code was then pushed to the develop branch.

Additional information for this comment from @Aqeelgene: As a team we were reviewing the code on the “develop” branch for the Project_task1.1.py and Project_task2.py files. @Azahra1214 and @VIV0503 were working through the code. The initial outputs appeared fine; however, on closer inspection there were duplicate lines. This was fixed and then pushed.

Aqeelgene / Sprint2andProject_007

Setting up our project and tracking information #6
