Penetrance estimation for multiple outcomes

BoPeng commented 2 years ago

Hopefully I am posting the question on the right repo.

We would like to estimate the penetrance of genetic factors on pedigrees with multiple outcomes, for example different types of cancers at different ages. We could combine multiple outcomes into simple AFFECTED and NORMAL but we would be missing important information on the penetrance of the mutation. Can classic Mendel or any projects in OpenMendel handle this?

Hua-Zhou commented 2 years ago

@BoPeng Can you elaborate a bit on your research question? Are you interested in the penetrance of the mutation for each individual, competing cancer type, the penetrance for the union of these cancer types, or something else? In other words, which phenotype(s) do you want to estimate the penetrance for? More details on the specific cancer types and data sets would help us understand better too.

BoPeng commented 2 years ago

We have data on a few large families with TP53 mutations. Family members can get one or more cancers at different ages. We are interested in estimating the penetrance of these mutations by considering not only "affection status" but also when and how many cancers they got.

We have run the data through version 16 of classic Mendel, considering only "affection status" (ignoring age of onset and multiple outcomes), and we are having difficulty interpreting the results. For example, we could not find the probability of affection for a given mutation, could not find a hazard ratio, and do not know what the p-value in the output means. Any suggestions on how we should proceed would be highly appreciated.

BoPeng commented 2 years ago

We followed example 14a of the Mendel distribution. Its output is

                     PENETRANCE ESTIMATION OPTION

 LOCUS NAME:     DISEASE 

 PARAMETER      ESTIMATE     STANDARD ERROR
  GRAND         -2.4895        1.0299    
  1/1           -2.8260        1.3092    
  1/2            1.4121        1.1913    
  2/2            1.4138        2.2315    
  FREQ-1        0.95240       0.35969E-01
  FREQ-2        0.47596E-01   0.35969E-01

 LNLIKELIHOOD =  -124.4368    
 LIKELIHOOD RATIO TEST INAPPROPRIATE

but we have no idea how to interpret the results....
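One possible reading of this table, offered as a sketch rather than a confirmed answer: the three genotype effects sum to approximately zero (-2.8260 + 1.4121 + 1.4138), which is consistent with GRAND being a grand mean and the genotype rows being deviations from it. If we further assume that the binomial model uses a logit link (an assumption not confirmed anywhere in this thread), the penetrance for a genotype would be the inverse logit of GRAND plus that genotype's effect. The helper `inv_logit` and the dictionary names below are illustrative only.

```python
import math


def inv_logit(x: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Estimates copied from the example 14a output.
grand = -2.4895
effects = {"1/1": -2.8260, "1/2": 1.4121, "2/2": 1.4138}

# ASSUMPTION: linear predictor = GRAND + genotype effect, on the logit scale.
penetrance = {g: inv_logit(grand + e) for g, e in effects.items()}

for genotype, p in penetrance.items():
    print(f"{genotype}: implied penetrance {p:.4f}")
```

Under these assumptions the 1/1 genotype has a penetrance well below 1%, while 1/2 and 2/2 both come out at roughly 25%. The large standard errors in the table mean these point values are very uncertain.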

Hua-Zhou commented 2 years ago

@BoPeng Thanks for the details. It'll be helpful if you can also include the control file, so we know your model specification.

BoPeng commented 2 years ago

We followed example 14a closely. Without releasing patient information, here is what we have:

Control.in, without considering age, age of onset, and number of cancers,

DEFINITION_FILE = data.DEF
MAP_FILE = data.MAP
OUTPUT_FILE = data.OUT
SUMMARY_FILE = data.SUMMARY
ECHO = No
PEDIGREE_FILE = data.PED
MAP_DISTANCE_UNITS = bp
ANALYSIS_OPTION = Penetrances
QUANTITATIVE_TRAIT = Cancer
PENETRANCE_MODEL = Binomial :: Distribution
PREDICTOR = DOMINANT :: Cancer
PARAMETER_INITIAL_VALUE = -2.0 :: 1/1
PARAMETER_INITIAL_VALUE = 2.0 :: 2/2

data.DEF file, with severity being the number of cancers

c.542G>A, AUTOSOME, 2
  G, 0.999980
  A, 0.000020
Age, Variable, 2
  Minimum,   0.00000
  Maximum,   110.000
AgeDX, Variable, 2
  Minimum,   0.00000
  Maximum,   110.000
Cancer, Variable, 2
  Minimum   ,   0.00000
  Maximum   ,   1.00000
Severity, Variable, 2
  Minimum   ,   0.00000
  Maximum   ,   10.00000

data.MAP file

c.542G>A, 7675070

data.PED file (partial and modified)

FAM,        1,    2,    3,  F,      ,   G|A,  38,   0,  0,  0,
FAM,        2,    4,    5,  F,      ,      ,  60,   0,  0,  0,

And the data.SUMMARY file


                     PENETRANCE ESTIMATION OPTION

 LOCUS NAME:     c.542G>A

 PARAMETER      ESTIMATE     STANDARD ERROR
  GRAND         -1.9350       0.56051    
  1/1           -1.2050       0.61292    
  1/2&2/2        1.2050       0.61292    

 LNLIKELIHOOD =  -41.11136    
 LIKELIHOOD RATIO STATISTIC =  4.71491
 DEGREES OF FREEDOM =    1
 ASYMPTOTIC P-VALUE =  0.02990214

which we do not know how to interpret.
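As a partial check on the last three lines of this output, the reported p-value can be reproduced as the upper tail of a chi-square distribution with 1 degree of freedom evaluated at the likelihood ratio statistic. What the test compares is not stated in the output; presumably it is the fitted model against one without the genotype effect, but that is an assumption. The odds ratio computed below also assumes that the binomial model uses a logit link and that 1/1 and 1/2&2/2 are deviations from GRAND; it is an odds ratio, not a hazard ratio, since age of onset does not enter this model. The function name `chi2_sf_1df` is illustrative.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Values copied from the data.SUMMARY output.
lrt_statistic = 4.71491
p_value = chi2_sf_1df(lrt_statistic)
print(f"p-value from the LRT statistic: {p_value:.6f}")

# ASSUMPTION: logit scale, with genotype effects added to GRAND.
grand = -1.9350
non_carrier_logit = grand + (-1.2050)  # genotype 1/1
carrier_logit = grand + 1.2050         # genotypes 1/2 and 2/2

# GRAND cancels in the difference, so the odds ratio depends only on the effects.
odds_ratio = math.exp(carrier_logit - non_carrier_logit)
print(f"implied carrier vs non-carrier odds ratio: {odds_ratio:.2f}")
```

The computed tail probability agrees with the reported ASYMPTOTIC P-VALUE to several decimal places, which supports reading it as a standard 1-degree-of-freedom likelihood ratio test.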
