Math behind MixSIAR - GitHub issues

KoleyFreeman commented 4 years ago

Hi,

I was hoping that someone could provide an explanation for a trend I am observing in my data. In my model I have carbon and nitrogen signatures for ~200 individuals and for 3 grouped sources (plants, arthropods and vertebrates). I loaded the source isotopic signatures as raw data and included one discrimination factor that is the same for each source. I included individual ID as a factor and used an uninformative prior. I have run the model several times at the 'test', 'normal' and 'long' run lengths, and each time the model converged: all of the variables fell below 1.05 for the Gelman-Rubin diagnostic, and the chains were consistent, with only 6% of the variables outside the +/- 1.96 range for the Geweke diagnostic.

The output of the model is that the diet is primarily vertebrate (mean +/- sd: 50 +/- 7.2%, range: 34-70%), then plant (35 +/- 6.0%, 17-49%) and finally arthropod (15 +/- 1.7%, 10-20%). What confuses me is why the proportion of arthropods is so low. Many of the consumers' signatures are close to that of the arthropods, yet the range of arthropods in the diet is only 10-20%. Shouldn't individuals that closely overlap the arthropod signatures have a higher proportion of arthropods in the diet? Any clarification of either the math behind the models or the rationale for why arthropods are consistently low in the diet would be much appreciated.

I have attached the isospace plot generated with the data for reference: Isoplot.pdf (https://github.com/brianstock/MixSIAR/files/4118561/Isoplot.pdf)

Thank you in advance for your help!

Koley

brianstock commented 4 years ago

Your mean estimates make sense to me given your isoplot: most of your consumers lie almost on a straight line between vertebrates and plants, with some more towards arthropods. I wouldn't expect the mean to have a high %Arthropod, but some individuals might, as you wrote. What about the %Arthropod for individuals, how variable is it?

Other thoughts:

- Is individual included as a fixed or random effect? If a fixed effect, an independent offset term is being estimated for each consumer, and the individual %Arthropods will be more variable. If a random effect, the individual diet p's are shrunk towards the overall mean p's (offsets shrunk to 0).
- The "uninformative"/generalist prior is 1/3, 1/3, 1/3. The weight of the prior reduces as you add more data. With 200 data points, your data should be swamping the prior. But see below:
- The sources (+TDF) have pretty high uncertainty relative to the distance between them. MixSIAR fits the source means/SDs, so they can wiggle a bit from their values in the isoplot to maximize the likelihood of your observed data. You can turn off this effect and see if it changes the results: either input the source data as means/sd/n with very high n (e.g. n = 1000), or replicate your raw source data so that the sample size is high.
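As a rough intuition for the last point, here is a minimal sketch of why inflating n pins the source means in place. It assumes the uncertainty in a fitted source mean behaves like the ordinary standard error, sd / sqrt(n). The function name and the sd value are hypothetical, and this is not MixSIAR's actual source model.

```python
import math


def standard_error(sd, n):
    """Standard error of a mean estimated from n samples with spread sd."""
    return sd / math.sqrt(n)


# Hypothetical source SD of 1.5 per mil; only the trend with n matters here.
for n in (50, 175, 1000):
    print(n, round(standard_error(1.5, n), 3))
```

The larger the nominal n, the less room the source mean has to move away from its plotted value. The source SD itself is unchanged, so the spread of the source is still passed to the model.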

AndrewLJackson commented 4 years ago

The midpoint of that cloud of consumers looks like it lies almost on a line between verts and plants, and a little off towards Arthropoda. Since plants are further away from that midpoint, you only need a relatively small proportion of them to explain the consumers compared with verts. A small amount of Arthropods is needed to explain the slight left shift off this line. Those results make good sense to me. Reasonably large between-individual errors too, given the spread.
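The geometric argument above can be sketched with the deterministic mixing equations, where a consumer's position is the proportion-weighted average of the source positions. The source coordinates below are hypothetical (not read from Isoplot.pdf), and the sketch ignores all the uncertainty that MixSIAR models. It only shows how a position in isospace maps to proportions when there are 3 sources and 2 tracers.

```python
# Hypothetical source means as (d13C, d15N), assumed already shifted by the
# discrimination factor. Illustrative values only, not read from Isoplot.pdf.
SOURCES = {
    "plant": (-28.0, 2.0),
    "arthropod": (-27.0, 8.0),
    "vertebrate": (-20.0, 10.0),
}


def mixture(p):
    """Return the (d13C, d15N) position of a diet with proportions p."""
    c = sum(p[name] * SOURCES[name][0] for name in SOURCES)
    n = sum(p[name] * SOURCES[name][1] for name in SOURCES)
    return c, n


def solve_proportions(point):
    """Invert the mixing equations for 3 sources and 2 tracers (Cramer's rule)."""
    (ax, ay), (bx, by), (cx, cy) = (
        SOURCES["plant"], SOURCES["arthropod"], SOURCES["vertebrate"]
    )
    dx, dy = point[0] - cx, point[1] - cy
    det = (ax - cx) * (by - cy) - (bx - cx) * (ay - cy)
    p_plant = (dx * (by - cy) - (bx - cx) * dy) / det
    p_arthropod = ((ax - cx) * dy - dx * (ay - cy)) / det
    return {
        "plant": p_plant,
        "arthropod": p_arthropod,
        "vertebrate": 1.0 - p_plant - p_arthropod,
    }


# Proportions similar to the reported means, used purely as an example diet.
diet = {"plant": 0.35, "arthropod": 0.15, "vertebrate": 0.50}
position = mixture(diet)
print(position)
print(solve_proportions(position))
```

In this toy setup, any point exactly on the plant-vertebrate line gets an arthropod proportion of zero. Only the offset away from that line raises the arthropod share, and the position along the line sets the plant/vertebrate split.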

––––––––––––

Dr Andrew Jackson, PhD, FTCD Associate Professor School of Natural Sciences, Department of Zoology Trinity College Dublin, the University of Dublin Dublin 2, Ireland.

+353 1 896 2728 | a.jackson@tcd.ie

Twitter: @yodacomplex (https://twitter.com/yodacomplex) http://www.tcd.ie/Zoology/research/research/theoretical/AndrewJackson.php


KoleyFreeman commented 4 years ago

Thank you both for your helpful and prompt replies.

In the original model I included individual ID as a fixed effect, as recommended in the Cladocera example of the MixSIAR manual. I ran an additional test model with ID as a random effect and, as expected from Brian's comment, the proportions for each individual were essentially equal to the mean and had very low variation (range 0.173-0.175). Individual ID as a fixed effect therefore appears to be the best way forward, since I am interested in individual-level differences in the diet.
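The collapse of the random-effect estimates onto the mean can be illustrated with a generic normal-normal partial-pooling sketch. This is a toy on a single scalar per individual and does not reproduce MixSIAR's parameterization. The function name, the example values and both SD arguments are hypothetical.

```python
def shrink(observations, between_sd, within_sd):
    """Pull each individual's value towards the grand mean.

    weight is 1 when the within-individual noise is zero (no pooling,
    similar in spirit to a fixed effect) and 0 when the estimated
    between-individual spread is zero (every individual equals the mean).
    """
    grand_mean = sum(observations) / len(observations)
    weight = between_sd**2 / (between_sd**2 + within_sd**2)
    return [grand_mean + weight * (y - grand_mean) for y in observations]


# Hypothetical raw per-individual arthropod proportions.
raw = [0.05, 0.15, 0.40]
print(shrink(raw, between_sd=0.01, within_sd=0.10))  # strong pooling
print(shrink(raw, between_sd=0.10, within_sd=0.01))  # weak pooling
```

With one isotope measurement per individual, a random-effect model has little information to separate individual differences from noise, so a small estimated between-individual spread and near-identical individual estimates are plausible.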

As recommended, I also ran two separate models in which the sources were input either as means with inflated sample sizes or as duplicated raw data. Overall, the output of the three models (the original and the two with the manipulated source data) was very similar. In the system I am studying, the consumers are generalists, so I collected and analyzed hundreds of samples (plants: n = ~50, inverts: n = ~175, vertebrates: n = ~75); the high uncertainty around the source means is therefore not unexpected.

Again, thanks for your assistance!

brianstock / MixSIAR

Math behind MixSIAR #202