Open ionpaani opened 8 months ago
Initially, when we started documenting information, we began in OneNote; the information is now being transferred to GitHub. Ahmad created skeleton code using ChatGPT, based on a prompt written in layman's terms. This is an example of the prompt:
I want you to write me code based on the following information
This is an example of the skeleton code, which Ahmad systematically worked through to produce the code named "final_version.py".
Skeleton code: Here’s a basic outline of the Python script (you’ll need to fill in the details):
import subprocess


def validate_variant(variant):
    # Use VariantValidator to validate the variant
    # Implement your code here
    pass


def get_transcript_info(variant):
    # Retrieve the most recent transcript info from VariantValidator
    # Implement your code here
    pass


def run_vep(transcript_info):
    # Run VEP with the transcript info
    # Implement your code here
    pass


def main():
    variant = input("Enter the variant description: ")
    # Validate the variant
    if validate_variant(variant):
        # Get transcript info
        transcript_info = get_transcript_info(variant)
        # Run VEP
        vep_output = run_vep(transcript_info)
        # Process VEP output (extract relevant details)
        # Implement your code here
        print("Variant information:")
        # Print relevant details from VEP output
        # Implement your code here
    else:
        print("Invalid variant description. Please check the format.")


if __name__ == "__main__":
    main()
This is the code final_version.py, which @Aqeelgene gave to me (@ionpaani) as a task to modify.
import json
import re

import requests


def fetch_transcript_id(gene_name):
    # Look up the MANE Select RefSeq transcript for a gene (GRCh37)
    url = f"https://rest.variantvalidator.org/VariantValidator/tools/gene2transcripts_v2/{gene_name}/mane_select/refseq/GRCh37?content-type=text/xml"
    response = requests.get(url)
    if response.status_code != 200:
        print(f"Error fetching data from the server. Status code: {response.status_code}")
        return None
    match = re.search(r'<reference type="str">(.*?)</reference>', response.text)
    return match.group(1) if match else None


def extract_hg38_genomic_description(response_text):
    # Pull the hg38 genomic HGVS description out of the raw response text
    match = re.search(r'"hg38":\s*\{\s*"hgvs_genomic_description":\s*"([^"]+)"', response_text)
    return match.group(1) if match else None


def get_ensembl_vep_data(hg38_id):
    # Query the Ensembl VEP REST API with the hg38 HGVS description
    url = f"https://rest.ensembl.org/vep/human/hgvs/{hg38_id}"
    headers = {"Content-Type": "application/json"}
    response = requests.get(url, headers=headers)
    if response.status_code != 200:
        print(f"Error fetching data from Ensembl VEP API. Status code: {response.status_code}")
        return None
    return response.json()


server = "https://rest.variantvalidator.org/VariantValidator/variantvalidator/"
print("Welcome to VariantBridge!\n\n"
      "Convert your genetic variants from hg19 to hg38. Please input the variant in HGVS or VCF format (hg19), e.g.:\n"
      "- HGVS: NM_000088.3:c.589G>T\n"
      "- HGVS: NC_000017.10:g.48275363C>A\n"
      "- HGVS: NG_007400.1:g.8638G>T\n"
      "- VCF: 17-50198002-C-A\n"
      "- VCF: 17:50198002:C:A\n"
      "- VCF: chr17:50198002C>A\n"
      "- VCF: chr17:g.50198002C>A\n\n"
      "Please enter your data in one of these formats to proceed with the conversion.")

variant = input("Insert variant:")

if variant.startswith('N'):
    ext = variant.split(':')[0]  # Use part of the variant before the colon
else:
    gene_name = input("Enter the gene name: ")
    ext = fetch_transcript_id(gene_name)
    if ext is None:
        print("No transcript ID found for the given gene name.")
        exit()

response = requests.get(f"{server}hg19/{variant}/{ext}")
if response.status_code == 200:
    hg38_genomic_description = extract_hg38_genomic_description(response.text)
    print(f"HG38 Genomic Description: {hg38_genomic_description}")
    if hg38_genomic_description:
        ensembl_vep_data = get_ensembl_vep_data(hg38_genomic_description)
        print("Ensembl VEP Data:", ensembl_vep_data)
    else:
        print("No hg38 genomic description found, cannot query Ensembl VEP API.")
else:
    print(f"Error: Received response code {response.status_code}")

print(response)
decode = response.json()
print(repr(decode))

fileName = "output.txt"
#fileName = makeFileName("variant")
if fileName:
    # Save the VariantValidator response as formatted JSON
    with open(fileName, "w") as file:
        file.write(json.dumps(decode, sort_keys=True, indent=2))
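The regular expression in extract_hg38_genomic_description depends on the exact spacing of the response text. As a possible alternative, here is a minimal sketch that parses the response as JSON and walks the nested structure looking for an "hg38" entry. The function name find_hg38_description and the sample payload are hypothetical: the payload is shaped only after the keys the regex above searches for, with placeholder values, and is not a verified VariantValidator response.

```python
import json


def find_hg38_description(node):
    """Recursively search parsed JSON for node["hg38"]["hgvs_genomic_description"].

    Returns the first description found, or None if there is none.
    """
    if isinstance(node, dict):
        hg38 = node.get("hg38")
        if isinstance(hg38, dict) and "hgvs_genomic_description" in hg38:
            return hg38["hgvs_genomic_description"]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None  # strings, numbers, etc. cannot contain the key
    for child in children:
        found = find_hg38_description(child)
        if found is not None:
            return found
    return None


# Hypothetical payload with placeholder values, for illustration only
sample_text = json.dumps({
    "example_variant": {
        "loci": {
            "hg19": {"hgvs_genomic_description": "NC_000017.10:g.48275363C>A"},
            "hg38": {"hgvs_genomic_description": "NC_000017.11:g.50198002C>A"},
        }
    },
    "flag": "example",
})

print(find_hg38_description(json.loads(sample_text)))
```

In final_version.py this could be called as find_hg38_description(response.json()), assuming the real response nests the "hg38" key in a similar way.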
From this point - Ahmad assigned issues
Task 1 - Saeeda
Task 2 - Zahra
Task 3 - Violetta
Meeting Saturday 17/02/24
Adding comments to each team member's respective files.
We are facing issues with Violetta's Ubuntu / PyCharm setup.
Violetta has decided to reinstall PyCharm.
24th February 2024
Created a new branch, ionpaani_3, and made the following changes.
I have modified and renamed files. These are:
1. modified: Task1_API.py. Added a title comment and further comments to explain the module imports at the start. The title comment is:
2. renamed: Task2_API2.py -> Task1_API2.py. Added a comment and renamed this file. I had named it Task2 when it is really part of Task 1, so I renamed it to avoid any confusion.
3. modified: variant_bridge/Project_task1.1.py. This file is "Task 1", where I addressed the issue which Ahmad gave me. I added a title comment to it.
4. modified: variant_bridge/final_version1.py. I added a title comment, which is the one below. No other changes have been made; it is exactly how @Aqeelgene had it.
26th February 2024
Up until this point I was having issues with PyCharm, Git and my laptop. I have spoken to the manufacturer and the laptop issues are almost resolved. For PyCharm and Git, I reinstalled everything.
Work I have completed today
I wanted to practise doing the Git work and saw that @ionpaani had raised a pull request that "develop" needed to be updated. So I asked @ionpaani to make sure I was working from the most up-to-date team develop branch.
I then went to PyCharm and created my own branch, Violetta1, from the updated "develop" branch.
@ionpaani had also raised an issue to merge the environment.yml file into the main variant_bridge folder. I did this too and have now closed the issue.
In summary, I discussed and compared changes with @ionpaani before merging information from two branches (ionpaani3 and develop) and pulling to update the main team develop branch. I have also refactored a file into the main variant_bridge folder.
Dear Team members:
Closed issue task1 as it is completed, with comments and titles added per file.
Created the branch Zahratask2 for completion of the task. I pushed this and the information has now been merged into the develop branch.
I had also created a requirements file specifically for Project_task2. This needs to be merged.
@Aqeelgene I have now closed the issue raised asking me to complete task2.
For the purposes of the complete project, a log file was created, named Project_log.py.
This was tested on my local machine using the FBN1 variant NM_000138.5:c.356G>A. However, it needs further testing, so a pull request has been created for @VIV0503.
The team was involved in creating the Project_log.py file. This file logs the steps of coding so it will be easier to track if there are any further errors. The code was tested by all four members systematically during the team meeting on 29th Feb / 1st March. This seemed to work well for all members. The code was then pushed to the develop branch.
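To illustrate the kind of step-by-step logging described here, the following is a minimal sketch using Python's standard logging module. It is not the contents of Project_log.py: the names build_logger and convert_variant, the logger name and the placeholder splitting step are all assumptions made for illustration.

```python
import logging


def build_logger(name="variant_bridge", log_file=None):
    """Return a logger that records each step with a timestamp and level."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid attaching a second handler on repeated calls
        handler = logging.FileHandler(log_file) if log_file else logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def convert_variant(variant, logger):
    """Hypothetical placeholder step: split a variant into reference and change."""
    logger.info("Received variant %s", variant)
    if ":" not in variant:
        logger.error("Variant %s has no ':' separator", variant)
        return None
    reference, change = variant.split(":", 1)
    logger.debug("Reference=%s change=%s", reference, change)
    return reference, change


log = build_logger()
convert_variant("NM_000138.5:c.356G>A", log)
```

Passing a file name as log_file would write the same messages to a file instead of the terminal, which may make it easier to trace where an error first appeared.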
Created a new branch, Violetta2, to which the references.txt file was added. This was pushed, a pull request was created, and the information was merged into the develop branch.
After the reference file was added to the branch Violetta2 a pull request was raised by @VIV0503. I merged this into the develop branch.
@ionpaani
Thank you Saeeda for the excellent efforts in this project!
Created the "developing" branch - please refer to https://github.com/Aqeelgene/Sprint2andProject_007/pull/22
Attempts at test file for Project_task1.1.py and Project_task2.py
Draft code pushed to Zahratesting branch - pull request raised - @ionpaani merged into developing.
The test code is now working for Project_task1_1.py
The reason it wasn't working was that the import statement did not use the correct file name. As soon as the file name was corrected, the functions were imported from the correct file, so the test now works.
Incorrect file name: Project_task1.1.py
Correct file name: Project_task1_1.py
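A likely explanation for why the first name fails: in an import statement a dot separates a package from its submodule, so a file whose name contains an extra dot cannot be imported by name. The sketch below checks whether a file name gives a usable module name; the helper is_importable_module_file is hypothetical and only illustrates the rule.

```python
import keyword
from pathlib import Path


def is_importable_module_file(filename):
    """Return True if the file's stem can be used in a plain import statement."""
    stem = Path(filename).stem  # strips only the final ".py"
    # A module name must be a valid identifier and must not be a Python keyword
    return stem.isidentifier() and not keyword.iskeyword(stem)


for candidate in ("Project_task1.1.py", "Project_task1_1.py"):
    print(candidate, is_importable_module_file(candidate))
```

Using underscores instead of dots in file names avoids the problem for any file that a test needs to import.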
Updated my Ahmad_Main_Submission with main and closed the pull request
The team was involved in creating the Project_log.py file. This file logs the steps of coding so it will be easier to track if there are any further errors. The code was tested by all four members systematically during the team meeting on 29th Feb / 1st March. This seemed to work well for all members. The code was then pushed to the develop branch.
Additional information for this comment from @Aqeelgene: As a team we were reviewing the code on the “develop” branch for the Project_task1.1.py and Project_task2.py files. @Azahra1214 and @VIV0503 were working through the code. The initial outputs appeared fine; however, on closer inspection there were duplicate lines. This was fixed and then pushed.
**Update 26/02/24** - I am no longer sure whether we should delete the branches: the rubric says to keep them, yet at a tutorial Pete said everything should be on main and the rest should be clean. I've restored all our branches and think we should just document everything.
**Update 03/03/24** - Please see the later comments. develop was closed, and we now have a "development" branch from which we have all branched out.
An early version of @Aqeelgene's code was put into final_version.py on the main branch. This needs to be pulled into the develop branch via a pull request.
We now have the same code in both main and develop. This keeps the main branch clean. I don't know how to do it, but I think GitHub may offer a prompt to protect the main branch; you might see it when you open this repo.
At this stage, prior to branching off develop, we can open a pull request from develop into main. Keep this pull request open until each small feature has been merged into develop. This has the effect of keeping a space within which dialogue can be added.
From develop branch, new branches can be created and small features added to each branch. This could be a few lines of code that improve the final_version.py file.
Each branch can submit one pull request into develop.
Be careful to only tweak a small part of the code or else face conflicts. Each branch could in theory add one line of code the first time round.
Overall this will create a history of what has been done by each branch (collaborator).
Don't forget to delete your personal branch after it's been merged into develop.
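The workflow above can be sketched as an ordered list of git commands. The snippet below only builds and prints the commands rather than running them; the function names, the branch name example_feature, the remote name origin and the commit message are assumptions for illustration. The pull request itself is opened on GitHub between the two steps.

```python
import shlex


def feature_branch_commands(branch, base="develop", remote="origin"):
    """Return the git commands for one small feature branch, in order."""
    return [
        ["git", "checkout", base],
        ["git", "pull", remote, base],      # start from the up-to-date base branch
        ["git", "checkout", "-b", branch],  # create the personal branch
        # ...edit a small part of final_version.py, then:
        ["git", "add", "final_version.py"],
        ["git", "commit", "-m", f"Small change on {branch}"],
        ["git", "push", "-u", remote, branch],
    ]


def cleanup_commands(branch, base="develop", remote="origin"):
    """Return the git commands to run after the pull request has been merged."""
    return [
        ["git", "checkout", base],
        ["git", "pull", remote, base],
        ["git", "branch", "-d", branch],              # delete the local branch
        ["git", "push", remote, "--delete", branch],  # delete the remote branch
    ]


for cmd in feature_branch_commands("example_feature") + cleanup_commands("example_feature"):
    print(shlex.join(cmd))
```

Keeping each branch to one small change, as suggested above, means each pull request into develop touches few lines, which should reduce the chance of merge conflicts.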
Original image created by @ionpaani