T-Wisse / MEP_Thomas

This repository serves as the documentation platform for my MEP in TU Delft.

Study the genetic interaction data for cell polarity and morphogenesis #4

Open leilaicruz opened 3 years ago

leilaicruz commented 3 years ago

State the results on this here, or link to the relevant content in your journal.

leilaicruz commented 3 years ago

Explore the tool https://yeastmine.yeastgenome.org/yeastmine/begin.do to get the genes from a certain module (GO slim term). You can use the template that goes from GO slim terms to genes: enter a slim term such as cell budding and you will see the genes related to it.

I have a Python script (https://leilaicruz.github.io/jupyter-book/evaluating-protein-domains-per-module.html) that runs this query and downloads the results automatically from your computer, without going to the website. You only need a YeastMine account and an API key, which you can generate in your account details.


from collections import defaultdict

import pandas as pd
from intermine.webservice import Service


def from_go_to_genes(go, label):
    # label: "GOTerm" or "GOSlimTerm"
    service = Service('https://yeastmine.yeastgenome.org/yeastmine/service', token='YOUR-TOKEN')
    query = service.new_query("Gene")
    # Restrict the parent ontology terms to the requested class
    query.add_constraint("goAnnotation.ontologyTerm.parents", label)
    query.add_view(
        "symbol", "goAnnotation.evidence.code.annotType",
        "goAnnotation.ontologyTerm.parents.name"
    )
    # Skip "NOT" annotations, keep manually curated ones, and match the term name
    query.add_constraint("goAnnotation.qualifier", "!=", "NOT", code="C")
    query.add_constraint("goAnnotation.qualifier", "IS NULL", code="D")
    query.add_constraint("goAnnotation.evidence.code.annotType", "=", "manually curated", code="F")
    query.add_constraint("goAnnotation.ontologyTerm.parents.name", "=", go, code="G")
    query.set_logic("(C or D) and F and G")

    # Collect the result rows column by column, keyed by row number
    data_toy = defaultdict(dict)
    for counter, row in enumerate(query.rows()):
        data_toy['gene-name'][counter] = row["symbol"]
        data_toy['evidence'][counter] = row["goAnnotation.evidence.code.annotType"]
        data_toy['annotation'][counter] = row["goAnnotation.ontologyTerm.parents.name"]

    # Drop repeated rows and renumber the index from 0
    data = pd.DataFrame(data_toy).drop_duplicates().reset_index(drop=True)
    return data


biological_processes_slim = ["cell budding", "lipid binding", "cytokinesis"]
go = biological_processes_slim[2]  # "cytokinesis"
data = from_go_to_genes(go, label='GOSlimTerm')

This is an example of querying YeastMine with this template and saving the result in a DataFrame. It is a tweaked version of the Python export of the query that you can generate in the application itself.
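The function above handles one term per call. As a minimal sketch of how results for several terms might be gathered into one table, the hypothetical helper below (`collect_genes`, which is not part of the original script) accepts any callable that returns a DataFrame with a `gene-name` column. The stand-in `fake_fetch` and its gene names are made-up placeholders so that the example runs offline; in practice one would presumably pass a small wrapper around `from_go_to_genes` instead.

```python
import pandas as pd


def collect_genes(terms, fetch):
    """Call fetch(term) for each term and stack the results into one table.

    fetch is assumed to return a DataFrame with a 'gene-name' column,
    e.g. lambda t: from_go_to_genes(t, label='GOSlimTerm').
    """
    # Tag every row with the term it came from before stacking
    frames = [fetch(term).assign(term=term) for term in terms]
    combined = pd.concat(frames, ignore_index=True)
    # Keep one row per (gene, term) pair
    combined = combined.drop_duplicates(subset=["gene-name", "term"])
    return combined.reset_index(drop=True)


# Offline stand-in for the YeastMine query; terms and gene names are placeholders
def fake_fetch(term):
    toy = {"term A": ["GENE1", "GENE2"], "term B": ["GENE2", "GENE3"]}
    return pd.DataFrame({"gene-name": toy[term]})


combined = collect_genes(["term A", "term B"], fake_fetch)
# Genes annotated to at least one of the terms
unique_genes = sorted(combined["gene-name"].unique())
print(unique_genes)
```

Keeping the `term` column makes it possible to see which term each gene came from, while `unique_genes` gives the union across terms.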

T-Wisse commented 3 years ago

That has helped a lot. I want to find a list of genes involved in a specific module, in this case cell polarity. I used YeastMine with the template 'GO Term name [and children of this term] -->All genes' to find all genes involved in establishment of cell polarity. However, I now think this may not be a full list, as there also has to be maintenance and possibly more. I also tried thecellmap.org, since they indicate which region is involved in cell polarity, but I did not find a way to get the list of genes in that module there. Then I looked at yeastgenome.org for genes annotated as involved in establishment or maintenance of cell polarity, but that gave me a list of only about 6 genes, so that is not complete by any means either. Any suggestions on how or where to find a complete list of genes known to be involved in a functional module would be welcome :)