jhelvy / logitr

Fast estimation of multinomial (MNL) and mixed logit (MXL) models in R with "Preference" space or "Willingness-to-pay" (WTP) space utility parameterizations
https://jhelvy.github.io/logitr/

Usage help #47

Closed ndrubins closed 1 year ago

ndrubins commented 1 year ago

Hi,

I've been trying to use nnet's multinom for my problem, but logitr seems like a much better approach, and I'd be grateful for some help on usage.

My problem comes from single-cell sequencing in biology. In our experiment we take an organ, such as the spleen, from several animals (samples, i.e. biological replicates) with covariates (e.g., a categorical variable such as age with two levels: young and old). We dissociate the organ into single cells and then profile the gene expression in each of these cells, so that the readout data (after some processing and analysis) is a table with these columns:

So these are clearly compositional data because in each sample we get a distribution of cells across the cell_types, which sum up to 100%.

My goal is to test for age effects in each cell_type, or in other words whether the estimated old/young ratio in each cell_type is different from 1.

In my data I have a total of 38168 cells from three young samples and four old samples, each assigned to one of five different cell_types.

I constructed the input data.frame as follows: id is an integer that encodes the cell; obsID is identical to id, because a cell is only observed once and hence assigned to a single cell_type once; alt encodes cell_type; choice has a value of 1 for the cell_type of the cell and 0 for the other 4 cell_types; and age and sample are factors.

Here are the first two ids in the input data.frame:

id  obsID  alt         choice  age    sample
1   1      NKT.cell    1       young  young_2
1   1      CD4.T.cell  0       young  young_2
1   1      CD8.T.cell  0       young  young_2
1   1      Treg        0       young  young_2
1   1      NK.cell     0       young  young_2
2   2      NKT.cell    0       young  young_2
2   2      CD4.T.cell  1       young  young_2
2   2      CD8.T.cell  0       young  young_2
2   2      Treg        0       young  young_2
2   2      NK.cell     0       young  young_2
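
For readers who want to reproduce this layout, here is a minimal base R sketch of how a one-row-per-cell table could be expanded into the long format shown above. The `cells` table, its column names, and its values are hypothetical toy data made up for illustration, not the real dataset; only the output columns (id, obsID, alt, choice, age, sample) follow the description in this post.

```r
# Hypothetical cell-level table: one row per cell (toy values, not real data)
cells <- data.frame(
  cell_type = c("NKT.cell", "CD4.T.cell", "Treg"),
  age       = c("young", "young", "old"),
  sample    = c("young_2", "young_2", "old_1"),
  stringsAsFactors = FALSE
)

# The five alternatives every cell could have been assigned to
cell_types <- c("NKT.cell", "CD4.T.cell", "CD8.T.cell", "Treg", "NK.cell")

n <- nrow(cells)        # number of cells (observations)
k <- length(cell_types) # number of alternatives per observation

# Repeat each cell k times and pair it with every alternative
chosen <- rep(cells$cell_type, each = k)
alts   <- rep(cell_types, times = n)

long <- data.frame(
  id     = rep(seq_len(n), each = k),
  obsID  = rep(seq_len(n), each = k),      # same as id: one observation per cell
  alt    = factor(alts, levels = cell_types),
  choice = as.integer(chosen == alts),     # 1 for the assigned cell type, else 0
  age    = factor(rep(cells$age, each = k), levels = c("young", "old")),
  sample = factor(rep(cells$sample, each = k))
)

print(head(long, 10))
```

Each obsID ends up with exactly five rows, of which exactly one has choice equal to 1.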

Then I run this logitr model command:

logitr(data = df, outcome = "choice", obsID = "obsID", pars = c("age", "sample"), randPars = c(sample = "n"), drawType = "sobol", numDraws = 200, numMultiStarts = 10)

The output only reports on the age and sample coefficients but not on alt.

So my question is whether it is actually possible to estimate the age effect for each alt. I guess it would be an interaction between alt and age, but I don't see how that can be specified in logitr.

Thanks a lot

jhelvy commented 1 year ago

I think you want pars = c("age*alt", "sample"). See the last part of this section which talks about interactions.
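
One way to read what such an age-by-alt interaction estimates is sketched below, assuming a standard multinomial logit with one cell type taken as the reference level. The symbols ($\alpha_j$, $\beta_j$, $\mathrm{old}_i$) are illustrative and do not come from the thread, and the sample term is left out for brevity.

```latex
% Probability that cell i is of type j, with old_i = 1 for old and 0 for young,
% and alpha_ref = beta_ref = 0 for the reference cell type
P(y_i = j \mid \mathrm{old}_i)
  = \frac{\exp(\alpha_j + \beta_j \,\mathrm{old}_i)}
         {\sum_{k} \exp(\alpha_k + \beta_k \,\mathrm{old}_i)}

% The interaction coefficient is the old-vs-young difference in log-odds
% of type j relative to the reference type
\beta_j
  = \log\frac{P(y = j \mid \mathrm{old})}{P(y = \mathrm{ref} \mid \mathrm{old})}
  - \log\frac{P(y = j \mid \mathrm{young})}{P(y = \mathrm{ref} \mid \mathrm{young})}
```

Under this reading, testing $\beta_j = 0$ asks whether the odds of cell type $j$ relative to the reference type differ between old and young animals, which is close to, but not the same as, testing whether the old/young ratio of raw proportions equals 1.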

ndrubins commented 1 year ago

Thanks!