LiuzLab / AI_MARRVEL

AI-MARRVEL (AIM) is an AI system for rare genetic disorder diagnosis
GNU General Public License v3.0

[WIP] Create module 3 and split into smaller pieces #44

Closed jylee-bcm closed 3 months ago

jylee-bcm commented 3 months ago

@hyunhwan-bcm Hello, hwan.

I would like your review of this idea, since you asked to see the dependencies between the features.

This refactoring splits the large getAnnotateInfoRow() function into several smaller pieces, so you can see that each piece has only limited dependencies.

def f1(row):
    return getAnnotateInfoRow_3_1(row, genomeRef)

def f2(row):
    return getAnnotateInfoRow_3_2(row, moduleList, decipherSortedDf)

def f3(row):
    return getAnnotateInfoRow_3_3(row, moduleList, gnomadMetricsGeneSortedDf)

def f4(row):
    return getAnnotateInfoRow_3_4(row, moduleList, omimGeneSortedDf)

def f5(row):
    return getAnnotateInfoRow_3_5(
        row,
        clinvarGeneDf,
        clinvarAlleleDf,
        hgmdHPOScoreDf,
        moduleList,
    )

df = df.apply(f1, axis=1, result_type='expand')
df = df.apply(f2, axis=1, result_type='expand')
df = df.apply(f3, axis=1, result_type='expand')
df = df.apply(f4, axis=1, result_type='expand')
annotateInfoDf = df.apply(f5, axis=1, result_type='expand')
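To illustrate the pattern on its own, here is a minimal, self-contained sketch of staged row annotation. It is not the AI_MARRVEL code: the lookup tables (`gene_scores`, `gene_diseases`), the stage functions, and the column names are hypothetical stand-ins for the real reference data such as `genomeRef` or `omimGeneSortedDf`. It uses `functools.partial` instead of the small wrapper functions, which is only one possible way to bind each stage to the data it needs.

```python
from functools import partial

import pandas as pd

# Hypothetical reference tables; the real pipeline would use its own DataFrames.
gene_scores = {"BRCA1": 0.9, "TP53": 0.8}
gene_diseases = {"BRCA1": "breast cancer"}


def annotate_score(row, scores):
    """Stage 1: depends only on the score table."""
    row = row.copy()
    row["score"] = scores.get(row["gene"], 0.0)
    return row


def annotate_disease(row, diseases):
    """Stage 2: depends only on the disease table."""
    row = row.copy()
    row["disease"] = diseases.get(row["gene"], "unknown")
    return row


variants = pd.DataFrame({"gene": ["BRCA1", "TP53", "XYZ"]})

# Each stage is bound to exactly the reference data it needs,
# which makes the dependencies of every stage explicit.
stages = [
    partial(annotate_score, scores=gene_scores),
    partial(annotate_disease, diseases=gene_diseases),
]

annotated = variants
for stage in stages:
    annotated = annotated.apply(stage, axis=1, result_type="expand")
```

Because every stage receives and returns a full row, stages can be reordered or tested one at a time as long as the columns they read already exist.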

If you find this idea helpful, please let me know and I will finalize the PR.

NOTE: I confirmed that this code runs without runtime errors, but I have not yet performed the output difference check, because I wanted to show the idea of the code separation quickly.
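One possible shape for such an output difference check is sketched below. It is an assumption about how the comparison could be done, not the test used in this PR: the functions `annotate_monolithic`, `annotate_length`, and `annotate_upper` and the toy input are hypothetical. The idea is to run the original single function and the split stages on the same input and compare the resulting frames with `pandas.testing.assert_frame_equal`.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def annotate_monolithic(row):
    """Hypothetical stand-in for the original single large function."""
    return pd.Series({"length": len(row["gene"]), "upper": row["gene"].upper()})


def annotate_length(row):
    """Hypothetical first stage: adds an intermediate column."""
    row = row.copy()
    row["length"] = len(row["gene"])
    return row


def annotate_upper(row):
    """Hypothetical final stage: produces the same columns as the original."""
    return pd.Series({"length": row["length"], "upper": row["gene"].upper()})


variants = pd.DataFrame({"gene": ["brca1", "tp53"]})

# Output of the original, unsplit function.
expected = variants.apply(annotate_monolithic, axis=1, result_type="expand")

# Output of the split stages applied in sequence.
staged = variants.apply(annotate_length, axis=1, result_type="expand")
actual = staged.apply(annotate_upper, axis=1, result_type="expand")

# Raises AssertionError with a description of the first difference, if any.
assert_frame_equal(actual, expected)
```

On real annotation output, the comparison would more likely be run against a saved result from the previous version of the code rather than recomputing both in one script.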

hyunhwan-bcm commented 3 months ago

Sounds like a plan, go for it

jylee-bcm commented 3 months ago

I will upload a proper PR after finishing the two tests.

jylee-bcm commented 3 months ago

Finished the integrity test and the speed test. I will upload a new PR.