Open DeirdreLoughnan opened 2 years ago
@lizzieinvancouver The model ran relatively quickly (less than 2h) but produced 8000 divergent transitions the chains mixed poorly for most parameters.
I will continue to think on it and see if I can figure out where I am going wrong.
@DeirdreLoughnan Usually that many divergences suggests the code is just wrong -- that the model is totally mis-specified. Re-reading above, where you had dimension issues "(dim declared = 1279, dim found = 42558)", I wonder if we should check that issue first -- I would suggest running the original sites within plots test data R code and stan model (you may need to update the Stan from using <- to using =; please push that) to make sure the correction is necessary there also ... otherwise it suggests we need to adapt here to match that old code.
@DeirdreLoughnan I updated the repo with the plotsinsites code, so please pull! The model had divergent transitions (yikes), and the test data looks messed up (I think that's the problem, not the Stan code, but I ran out of time to work on this). I did not need to change the dimensions to get the Stan code to run, so that could be related to the issue in your model.
I'd like to spend a little time trying to code the family model soon (ideally in the next week); then I can hopefully fix the dimensions issue you're having and/or figure out something we might be not thinking of. To help me with that, can you:
Move the synchrony code for this into a 'sync' or 'sandbox' folder in pmm/analyses. In there please put:
Not critical, but could you see if Faith:
Thank you! I know you must be tired of this, but we're making progress.
PS: I should be in Monday afternoon if you want to meet to discuss this.
@lizzieinvancouver I was looking over the code for this model and your old model again and it occurred to me that the issue with my first attempt might have been the data I inputted into the model for famnum. Instead of simply inputting a list of family names from the full dataset (which would be length N), it occurred to me after looking at your R test code, that I should actually be inputting a list of family names for each species (so it would be Nspp in length).
I have pushed my code to the pmm repo that tries this and will post an update of the model output when it is finished running.
@lizzieinvancouver I also checked how well my test data was estimating the intercepts like you asked and it looks great!
@lizzieinvancouver the issue I thought I found yesterday made essentially not difference and the model still produces 7998 divergent transitions. The pairs plot looks very similar to the one above.
@lizzieinvancouver I am sorry to hear that you are having a computer nightmare, I hope you can get it sorted quickly!
I asked Faith if she had code for a nested model and the only example should could think of was in: bayes2020/Projects/Faith/rangesModel bayes2020/Projects/Faith/stan/nointer 3levelwpop force&photo ncp FaithExample.stan
It looks promising! I will take another go at getting the model working early next week.
I am planning to be in the office Monday and Wednesday if you are in and have time to chat!
@DeirdreLoughnan Since I am with Mike for a few days I thought I would work on this, if time allows. Can you please check everything in the sandbox is up to date (aka correct for what you're using now)? It should have test data and code.
Can you also add a real data example? Perhaps for birds or some relatively small clade?
Thank you!
@lizzieinvancouver to clarify, do you want to work with Mike on the family level model we never got working, or on the species level model that I tried to add a grand mean and study level effect to (what I am using now)?
I assumed you mean the model with the grand mean/study, so I did push code/models and a real data example for this today. I had test data for the simple version of the model, but quickly tried to write some test data for a model with a grand mean and study, but it is currently producing a large Rhat and takes a very long time to run. I also wrote example code that focuses on just amphibians (which rungs very quickly) and birds (which still takes a few hours to run).
If this is wrong and you are interested in the family level model, I can work on that as well.
@DeirdreLoughnan I mean working on the family level model we never got working. So no study. Can you please post that (and move whatever you have there to a separate folder)? Thanks.
Oh ok great! I have separated the code and taken a go and setting up some test data and examples using amphibian and aves data.
The models do not run well and produce divergent transitions. Do you want me to keep working on this and try to fix it? Or would it be better for you to look it over and chat with Mike about it first?
Thanks!
the species level model that I tried to add a grand mean and study level effect to (what I am using now)?
@DeirdreLoughnan Thanks for the new files. Just to clarify -- you did get the grand mean model running, right? I would have thought that would be easy. Adding study might be harder.
@lizzieinvancouver I ran the model with the grand mean for the species group data, but for some of the models the Rhat's were too high (aves had an Rhat of 1.17 for example). I was running it for the full dataset, but I was getting some weird warnings on midge this week so I stopped the model run and restarted it to see if that helps.
I can try running it again and see, it does seem to take a long time to run though which might be a sign something is not yet right with it.
@DeirdreLoughnan You should get it running on a small (or test) dataset before running the full model. It should run the same as the regular model unless you have a coding error. Can you point me to the phylogeny model you're using (no grand mean, no study effect) that you believe is working and is your current `main' model? You may have already put it in the pmm repo -- if so, just remind me which files.
Thank you!
@lizzieinvancouver ok, I will keep working on it.
The phylogeny model that I am using currently as the main model is:
pmm/analyses/sync/stan/phylogeny_synchrony_cholesky_geoffmod.stan
@DeirdreLoughnan Okay, I actually made some progress! I just pushed edits, but they need work:
@lizzieinvancouver I have added the code for the family level Stan and associated R code to the repo.
The model runs with a few changes to the model code you linked to. I initially had line 33, where famnum is defined in the data block as famnum[Nspp], as you had sitenum[J] in the original code. But this gave the error message that there was a mismatch in the dimensions declared for famnum (dim declared = 1279, dim found = 42558). So I changed it to be declared as famnum[N] and the model runs.
I am currently running this model for and the estimated leapfrog time is 137s. I will post an update when it is finished running.