rmarkello / abagen

A toolbox for working with Allen Human Brain Atlas microarray expression data
https://abagen.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Differential stability values #198

Closed gkleinman84 closed 3 years ago

gkleinman84 commented 3 years ago

Hi, is it possible to retrieve the differential stability values for each gene that are calculated during the parcellation process? Cheers!

rmarkello commented 3 years ago

Hi @gkleinman84 ! Welcome to abagen :wave:

Unfortunately the differential stability estimates aren't directly accessible from the abagen.get_expression_data() workflow; however, you can obtain a variation of them via the following:

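>>> import abagen
>>> atlas = abagen.fetch_desikan_killiany()
>>> # return_donors=True keeps a separate expression matrix for each donor
>>> expression = abagen.get_expression_data(atlas['image'], atlas['info'], return_donors=True)
>>> # the second return value holds the per-gene differential stability
>>> stability = abagen.correct.keep_stable_genes(list(expression.values()), return_stability=True)[1]
>>> # optional: collapse the donor-level matrices into one group-average matrix
>>> expression = abagen.samples_.aggregate_samples(expression)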

Here, stability will be an array where each entry indicates the differential stability of the corresponding gene in the columns of the expression matrix. This is slightly different from the differential stability estimates calculated internally during abagen.get_expression_data(), as here the stability estimates are calculated as correlations between parcels in atlas rather than between AHBA-defined brain structures from their ontology; however, the values tend to be very similar!
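To make the idea concrete, here is a minimal sketch of a parcel-based differential stability calculation on synthetic data. It assumes DS means the average pairwise Spearman correlation, across donors, of each gene's expression profile over parcels. It is not a copy of abagen's internals: the helper names (`rank_columns`, `differential_stability`), the gene labels, and the random values are all hypothetical. It also shows how the resulting values can be labelled by the gene columns.

```python
import numpy as np
import pandas as pd


def rank_columns(values):
    """Rank each column of a 2D array (0 = smallest); assumes no tied values."""
    return np.argsort(np.argsort(values, axis=0), axis=0).astype(float)


def differential_stability(donors):
    """Average pairwise Spearman correlation across donors, one value per gene.

    `donors` is a list of parcel x gene DataFrames sharing index and columns.
    (Hypothetical helper for illustration, not part of abagen.)
    """
    ranked = [rank_columns(d.to_numpy()) for d in donors]
    # z-score the ranks over parcels so a mean of products is a correlation
    zscored = [(r - r.mean(axis=0)) / r.std(axis=0) for r in ranked]
    corrs = [
        (zscored[i] * zscored[j]).mean(axis=0)
        for i in range(len(zscored))
        for j in range(i + 1, len(zscored))
    ]
    return pd.Series(np.mean(corrs, axis=0), index=donors[0].columns)


# Synthetic donors: one gene with a shared spatial pattern, one pure noise
rng = np.random.default_rng(0)
parcels = [f"parcel_{i}" for i in range(20)]
shared = rng.normal(size=20)
donors = [
    pd.DataFrame(
        {
            "STABLE": shared + 0.1 * rng.normal(size=20),
            "NOISY": rng.normal(size=20),
        },
        index=parcels,
    )
    for _ in range(3)
]

# Label each DS value with its gene and sort from most to least stable
ds = differential_stability(donors).sort_values(ascending=False)
print(ds)
```

With real output, the same labelling step would be something like `pd.Series(stability, index=...)` using the gene columns of one of the donor-level matrices, assuming those are pandas DataFrames with genes as columns.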

Hope this helps, but let me know if you have any questions.

gkleinman84 commented 3 years ago

Hi @rmarkello,

Thank you very much for your answer! This already helps a lot. When you say "the values tend to be very similar", can you estimate whether they are less or more conservative or generally just differ slightly without a direction?

Is the last part of the code (expression = abagen.samples_.aggregate_samples(expression)) necessary for the DS values or is that unrelated?

rmarkello commented 3 years ago

Hi @gkleinman84 ! Sorry for the delay in getting back to this.

Yes, you're absolutely right: that last line is unrelated to the DS calculations. It's just for aggregating the donor-level expression values into a single group-average expression matrix (so you don't have to re-run the abagen.get_expression_data() workflow). If you don't need the expression data, you can safely ignore it!
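As a rough illustration of what "aggregating" means here, the sketch below averages two hypothetical donor-level matrices into one group-average matrix with plain pandas. It assumes a simple across-donor mean per region and is not taken from abagen.samples_.aggregate_samples() itself; the donor names and values are made up.

```python
import pandas as pd

# Hypothetical donor-level matrices (regions x genes); values are made up
donor_a = pd.DataFrame({"GENE_A": [1.0, 2.0], "GENE_B": [0.0, 4.0]}, index=[1, 2])
donor_b = pd.DataFrame({"GENE_A": [3.0, 4.0], "GENE_B": [2.0, 0.0]}, index=[1, 2])

# Stack the donors, then average the rows that share a region label
group_average = pd.concat([donor_a, donor_b]).groupby(level=0).mean()
print(group_average)
```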

rmarkello commented 3 years ago

As for your question about DS value similarity: I can't say with any certainty whether one estimate is more or less conservative, but you could dig into the code to try it out yourself!

The relevant function that calculates the DS from the sample level information is abagen.probes_._diff_stability(). You could step through the main abagen.get_expression_data() workflow by hand and then modify that function to return the DS values directly, if you wanted to compare them!
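Once both sets of DS values are in hand, a comparison along these lines might show whether they differ in a consistent direction. This is only a sketch: `compare_stability` is a hypothetical helper, and the two arrays are invented numbers standing in for the parcel-based and structure-based estimates, assumed to be in the same gene order.

```python
import numpy as np


def compare_stability(ds_a, ds_b):
    """Summarize how two per-gene DS estimates differ (hypothetical helper)."""
    ds_a = np.asarray(ds_a, dtype=float)
    ds_b = np.asarray(ds_b, dtype=float)
    diff = ds_a - ds_b
    return {
        # signed average: clearly positive or negative suggests a direction
        "mean_difference": float(diff.mean()),
        # share of genes for which the first estimate is the larger one
        "fraction_a_higher": float((diff > 0).mean()),
        # overall agreement between the two estimates
        "pearson_r": float(np.corrcoef(ds_a, ds_b)[0, 1]),
    }


# Invented values for illustration only
parcel_ds = np.array([0.80, 0.55, 0.30, 0.10])
structure_ds = np.array([0.78, 0.57, 0.25, 0.12])

summary = compare_stability(parcel_ds, structure_ds)
print(summary)
```

A mean difference near zero together with a high correlation would point to small differences without a direction, whereas a consistently positive or negative difference would suggest one estimate runs higher than the other.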