xinyi030 / PHS43010_NonStatGroup


Model Reproduction & Simulation Studies #4

Open xinyi030 opened 1 year ago

xinyi030 commented 1 year ago

Description:

This issue involves:


Responsibilities:

Assignees will:


Assignees: Wenqian, Gabriel, Bowei, Zikai

kangbw702 commented 1 year ago

Hi Team,

I tried reproducing Figures 6 & 7 by fitting a two-parameter logistic regression model. The code can be found under the Model Reproduction section. The output figures have also been uploaded.

The final MTD and the mean predicted p(DLT) curve are well replicated, but the confidence interval looks different. The reason is that the original research used a very different inference approach (Bayesian CRM and likelihood-based CRM) from what I did here (simple logistic regression).
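For context, a stripped-down sketch of the kind of fit I mean (not the exact repo code; the dose levels and DLT data below are made up):

```r
# Minimal two-parameter logistic fit with pointwise Wald CIs (illustrative data only)
set.seed(1)
doses <- c(0.5, 1, 3, 5, 6)                            # hypothetical dose levels (mg)
dat <- data.frame(
  dose = rep(doses, times = c(3, 6, 9, 12, 6)),        # hypothetical cohort sizes
  dlt  = rbinom(36, 1, prob = 0.15)                    # placeholder DLT indicators
)

fit <- glm(dlt ~ dose, family = binomial, data = dat)  # intercept + slope: two parameters

# Predicted p(DLT) curve with a 95% Wald CI computed on the logit scale
grid <- data.frame(dose = seq(min(doses), max(doses), length.out = 100))
pr   <- predict(fit, newdata = grid, type = "link", se.fit = TRUE)
grid$p     <- plogis(pr$fit)
grid$lower <- plogis(pr$fit - 1.96 * pr$se.fit)
grid$upper <- plogis(pr$fit + 1.96 * pr$se.fit)
head(grid)
```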

Could you please check my code? And any ideas on improving the CI part?

Best, Bowei

xinyi030 commented 1 year ago

Hi All,

Just a quick reminder that you are supposed to clarify your plan for this issue by the end of May 13th. Please write down your specific task assignments and your estimated finish times.

@kangbw702 @tzkli @Sirius2713 @GabeNicholson

Thanks, Xinyi

tzkli commented 1 year ago

Hi all,

The paper cited uses a Bayesian approach with a prior, unlike the frequentist logit regression that @kangbw702 attempted. I've reproduced Figure 6b using a canned implementation of Bayesian CRM and pushed my changes to the repo. The current Figure 7 is also off: this one uses a likelihood-based estimation procedure, so the code Bowei uses should be mostly correct, but the CIs are off. I think the CI calculation differs from standard logistic regression, but I don't know how. I can have a look at the original paper.
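For reference, the call I have in mind is along these lines (a sketch only, with argument names from memory of the bcrm interface; the skeleton, stopping rule, and true curve are illustrative assumptions, so please check the pushed code for the exact settings):

```r
# Sketch of a two-parameter logistic Bayesian CRM via the bcrm package (illustrative settings)
library(bcrm)

dose   <- c(0.5, 1, 3, 5, 6)                 # hypothetical dose levels
p.tox0 <- c(0.05, 0.10, 0.20, 0.35, 0.50)    # hypothetical prior skeleton
truep  <- c(0.05, 0.12, 0.25, 0.40, 0.55)    # hypothetical true toxicity curve (for simulation)

sim <- bcrm(
  stop        = list(nmax = 36),             # stop after 36 patients
  p.tox0      = p.tox0,
  dose        = dose,
  ff          = "logit2",                    # two-parameter logistic model
  prior.alpha = list(4, c(1, 1),             # bivariate lognormal prior on the two parameters
                     rbind(c(1, 0), c(0, 1))),
  cohort      = 3,
  target.tox  = 1/3,
  constrain   = FALSE,                       # no dose-skipping constraint
  method      = "rjags",                     # MCMC is needed for the two-parameter model
  simulate    = TRUE, nsims = 10, truep = truep
)
print(sim)
```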

Best, Zikai

kangbw702 commented 1 year ago

> Hi all,
>
> The paper cited uses a Bayesian approach with a prior, unlike the frequentist logit regression that @kangbw702 attempted. I've reproduced Figure 6b using a canned implementation of Bayesian CRM and pushed my changes to the repo. The current Figure 7 is also off: this one uses a likelihood-based estimation procedure, so the code Bowei uses should be mostly correct, but the CIs are off. I think the CI calculation differs from standard logistic regression, but I don't know how. I can have a look at the original paper.
>
> Best, Zikai

Hi Zikai,

Your results for Fig. 6 using bcrm make more sense. My previous derivation assumes the observations are independent, but they are actually correlated (the dose for a new round of the experiment depends on the outcomes of the current one). For the Fig. 7 likelihood-based method, I will check the paper too.

Best, Bowei

kangbw702 commented 1 year ago

Hi Team,

I replicated Fig. 7b using the dfcrm package. Please see the updated code and plot.
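For anyone who wants to reproduce it, the fit is essentially a one-line call. A sketch with made-up data (the skeleton and outcomes below are not the real ones; see the repo for the actual code):

```r
# Sketch of a likelihood-based (method = "mle") CRM fit with dfcrm (illustrative data only)
library(dfcrm)

skeleton <- c(0.05, 0.10, 0.20, 0.35, 0.50)   # hypothetical prior skeleton
target   <- 1/3

# Hypothetical trial history: assigned dose level and DLT indicator for each patient
level <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3)
tox   <- c(0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0)

fit <- crm(prior = skeleton, target = target, tox = tox, level = level,
           method = "mle",       # likelihood-based CRM
           model  = "logistic")  # logistic working model (fixed intercept in dfcrm)

fit$mtd    # recommended dose level
fit$ptox   # estimated p(DLT) at each level (fit$ptoxL / fit$ptoxU give interval bounds)
```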

Best, Bowei

xinyi030 commented 1 year ago

Hi Team,

Just wondering how the project is going. The team would like to get your results and then start the analysis part. 😊 Please let me know if you have any plans/questions. Thanks!

Best, Xinyi

kangbw702 commented 1 year ago

Hi @GabeNicholson @Sirius2713,

I am not sure whether you have completed the simulation part. Since you are using Python, I drafted a simulation study in R as a backup. Please see the uploaded .rmd file and let me know if you have any suggestions.

Best, Bowei

Sirius2713 commented 1 year ago

Hi Bowei,

Thanks for the help. We've actually tried both R and Python. We already have some results and just need to check some details. You can find them in the research_simulation branch.

Best, Wenqian


xinyi030 commented 1 year ago

Thanks for your great work, team!

I'll try to give your work a careful look and get back to you by tomorrow.

@tzkli @kangbw702 I was trying to work through your crm_sim_v1.rmd but somehow I couldn't successfully knit it. Could you give me a knitted version?

@Sirius2713 @GabeNicholson Could you provide an update on the results you've obtained so far? Have you managed to gather all the necessary data and findings? If not, what are the remaining pieces that you still need to work on? Your insights will be greatly appreciated.

Looking forward to your responses.

Best, Xinyi

Sirius2713 commented 1 year ago

Hi Xinyi,

You can check the research_simulation branch for our progress. We'll merge it into main when it is finalized.

Best, Wenqian


GabeNicholson commented 1 year ago

I'm just running the Python version to validate the R program, since the R program uses a different prior and we want to make sure the results are not too different. But the Python code takes a long time to run because it uses a computationally heavy algorithm.

tzkli commented 1 year ago

Hi Gabe, thanks for the update. I was double-checking crm_sim_v1.rmd and there was a mistake in the model (it uses a one-parameter logit instead of a two-parameter logit). I'm fixing it. We can then use this to further validate the results.
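To be concrete about what I mean by the two models (with d the dose or standardized dose label; the fixed intercept in the one-parameter version is whatever the implementation defaults to, e.g. 3 in dfcrm):

$$\text{one-parameter: } \operatorname{logit} p(d) = a_0 + e^{\beta} d \ (a_0 \text{ fixed}), \qquad \text{two-parameter: } \operatorname{logit} p(d) = \alpha + e^{\beta} d \ (\alpha, \beta \text{ both estimated}).$$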

HongzhangXie commented 1 year ago

Hi team, thanks for checking and uploading the crm_sim_v2.rmd code. The code is great! I am in the conclusion group and am planning to write some discussion of the code to answer research questions 1 & 2. When I try to run the code, I get the error "Error in t[, "dose"] : incorrect number of dimensions" in "multiple runs - case 1". Am I making a mistake? @tzkli @kangbw702

GabeNicholson commented 1 year ago

@Sirius2713 and I will be testing out some ideas for the bonus question (3). If anyone has ML experience and wants to contribute, you're welcome to do so.

Sirius2713 commented 1 year ago

Hi @tzkli @kangbw702, the bcrm function in R actually uses a slightly different form of the two-parameter logistic function, so the priors you used in your code are not appropriate. Also, sdose is different from dose: it is the dose label computed from the skeleton and the priors on beta1 and beta2. I've uploaded a new version of the simulation now.

Sirius2713 commented 1 year ago

Hi team, after @GabeNicholson and I worked on the simulation parts, we found that the model performs better under scenario 2. The results file is uploaded as "bcrm_results.txt". We are now moving on to the bonus question. We would appreciate any ideas to mitigate the difference between the two scenarios, i.e., to improve the performance under scenario 1.

xinyi030 commented 1 year ago

Thanks for the update, Wenqian! 😊

kangbw702 commented 1 year ago

Hi Team,

Thank you for revising my crm_sim_v1.rmd. For the bonus question, in my CRM rule settings (in each cohort: if prop.tox <= 1/6, upgrade; if 1/6 < prop.tox <= 1/3, keep; if prop.tox > 1/3, downgrade), I found that adding extra dose levels lower than the previous lowest dose (0.5) may help to mitigate the difference between the two scenarios. Consider a standard linear regression Y = a + bX + e. If the range of X is too small, the standard error of the estimated b will be inflated; note that Var(b_hat) = sigma^2 (X'X)^(-1), where sigma^2 = Var(e). By the same logic, in the CRM design, if we increase the range of attained doses we get a better estimate of the model parameters and hence of the predicted p(tox). I validated this idea in crm_sim_v1.rmd, but I'm not sure whether the result is sensitive to different model settings. @GabeNicholson, @Sirius2713, maybe you can try this under your simulation framework.
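A tiny toy check of that point (independent errors and made-up numbers, so it only loosely mirrors the CRM setting):

```r
# Toy illustration: the slope SE shrinks as the spread of X grows
set.seed(1)
avg_se_slope <- function(x, sigma = 1, nrep = 500) {
  mean(replicate(nrep, {
    y <- 1 + 2 * x + rnorm(length(x), sd = sigma)
    summary(lm(y ~ x))$coefficients["x", "Std. Error"]
  }))
}
x_narrow <- runif(36, 0.9, 1.1)   # 36 "doses" squeezed into a narrow range
x_wide   <- runif(36, 0.0, 2.0)   # 36 "doses" spread over a wide range
avg_se_slope(x_narrow)            # large average SE for the slope
avg_se_slope(x_wide)              # much smaller average SE
```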

Best, Bowei

GabeNicholson commented 1 year ago

Merged our branch into main.

GabeNicholson commented 1 year ago

> Hi Team,
>
> Thank you for revising my crm_sim_v1.rmd. For the bonus question, in my CRM rule settings (in each cohort: if prop.tox <= 1/6, upgrade; if 1/6 < prop.tox <= 1/3, keep; if prop.tox > 1/3, downgrade), I found that adding extra dose levels lower than the previous lowest dose (0.5) may help to mitigate the difference between the two scenarios. Consider a standard linear regression Y = a + bX + e. If the range of X is too small, the standard error of the estimated b will be inflated; note that Var(b_hat) = sigma^2 (X'X)^(-1), where sigma^2 = Var(e). By the same logic, in the CRM design, if we increase the range of attained doses we get a better estimate of the model parameters and hence of the predicted p(tox). I validated this idea in crm_sim_v1.rmd, but I'm not sure whether the result is sensitive to different model settings. @GabeNicholson, @Sirius2713, maybe you can try this under your simulation framework.
>
> Best, Bowei

If I'm understanding your results correctly, the performance for scenario 2 decreased from 62% to 57% with the new approach? I suppose that does mitigate the difference, but by worsening the results; I think we should try to increase the accuracy rather than decrease it.

kangbw702 commented 1 year ago

> Hi Team, Thank you for revising my crm_sim_v1.rmd. For the bonus question, in my CRM rule settings (in each cohort: if prop.tox <= 1/6, upgrade; if 1/6 < prop.tox <= 1/3, keep; if prop.tox > 1/3, downgrade), I found that adding extra dose levels lower than the previous lowest dose (0.5) may help to mitigate the difference between the two scenarios. Consider a standard linear regression Y = a + bX + e. If the range of X is too small, the standard error of the estimated b will be inflated; note that Var(b_hat) = sigma^2 (X'X)^(-1), where sigma^2 = Var(e). By the same logic, in the CRM design, if we increase the range of attained doses we get a better estimate of the model parameters and hence of the predicted p(tox). I validated this idea in crm_sim_v1.rmd, but I'm not sure whether the result is sensitive to different model settings. @GabeNicholson, @Sirius2713, maybe you can try this under your simulation framework. Best, Bowei
>
> If I'm understanding your results correctly, the performance for scenario 2 decreased from 62% to 57% with the new approach? I suppose that does mitigate the difference, but by worsening the results; I think we should try to increase the accuracy rather than decrease it.

You are right. For scenario 1, if we start from a lower level the accuracy improves, which may be because the distribution of dose visit frequencies is more spread out and centered at the true MTD (the approach before adjustment shows a more skewed distribution). But for scenario 2 this distribution is already good before the adjustment, and adding extra lower levels may reduce efficiency. Is it possible to decide whether to add extra lower levels based on preliminary knowledge or interim trial results, so that only scenarios with a small potential MTD, like scenario 1, are adjusted?

GabeNicholson commented 1 year ago

> You are right. For scenario 1, if we start from a lower level the accuracy improves, which may be because the distribution of dose visit frequencies is more spread out and centered at the true MTD (the approach before adjustment shows a more skewed distribution). But for scenario 2 this distribution is already good before the adjustment, and adding extra lower levels may reduce efficiency. Is it possible to decide whether to add extra lower levels based on preliminary knowledge or interim trial results, so that only scenarios with a small potential MTD, like scenario 1, are adjusted?

I like the idea, but I'm not sure in what cases you could do that a priori unless you had special insight. I was thinking that if we wanted to be more aggressive, we could reweight the samples as the model learns, to prioritize faster parameter learning. For example, after 15 samples to train on, our model predicts how many toxic events will occur for the next 3 individuals. If it predicts correctly (say it predicts 1/3 of the individuals will have a toxic event and that is indeed what occurs), then we take an aggressive strategy and give those samples more weight (say 1.5x). We continue on until the model gets it very wrong, and then we retrain the model on the previous samples and forget the weighting. The strategy here is that we want our model to learn fast with only a small number of samples, so we bet that when the model predicts correctly it is probably on the right track for tuning the two betas that approximate the true unknown probabilities, and we push it in that direction. The hope is that it will extract more information and will be able to discriminate between 0.25 and 0.3 more easily (at least on average) over many simulations.

I'm going to code this up later and see if it works but I'm open to any suggestions.
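Roughly what I have in mind, sketched in R for concreteness since that's what the rest of the repo uses (my actual code will be in the Python notebook; all data and thresholds here are placeholders):

```r
# Sketch of one reweighting step (placeholder data; quasibinomial avoids the
# non-integer-weights warning that a binomial glm gives with fractional weights)
set.seed(2)
df <- data.frame(dose = rep(1:6, each = 3))
df$dlt <- rbinom(nrow(df), 1, plogis(-3 + 0.5 * df$dose))   # fake trial outcomes
df$w   <- 1

train <- df[1:15, ]   # first 15 patients
nxt   <- df[16:18, ]  # next cohort of 3

fit <- glm(dlt ~ dose, family = quasibinomial, data = train, weights = w)
pred_count <- round(sum(predict(fit, newdata = nxt, type = "response")))

if (pred_count == sum(nxt$dlt)) {
  df$w[16:18] <- 1.5   # prediction was right: trust this cohort more
} else {
  df$w[] <- 1          # prediction was wrong: forget the weighting
}
fit2 <- glm(dlt ~ dose, family = quasibinomial, data = df[1:18, ], weights = w)
summary(fit2)$coefficients
```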

tzkli commented 1 year ago

Hi @GabeNicholson and @Sirius2713, could you say a bit more about the priors being wrong? In crm_sim_v2.rmd, I'm using a normal prior with no correlation between the two parameters. In your R code, you seem to be using a more informative normal prior. Since a normal prior on the parameters seems to be the standard when the model is two-parameter, I think this shouldn't be a problem? The original paper uses an exponential prior, but that's probably infeasible for a two-parameter specification.

I've pushed some changes and two figures from the case 1 simulation to the repo.

Sirius2713 commented 1 year ago

Hi @tzkli, the two-parameter logistic in bcrm uses log(\alpha) as the intercept and \beta as the slope, but our paper uses \alpha as the intercept and exp(\beta) as the slope. Therefore, to specify a prior intercept mean of 0 and a slope mean of 1, we need to set the prior mean to (1, 1) instead of (0, 0). It would be problematic for the function if we used (0, 0) here.
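Writing it out, the mapping I'm relying on (with d the standardized dose label; worth double-checking against the bcrm documentation):

$$\text{bcrm ("logit2"): } \operatorname{logit} p(d) = \log(\alpha) + \beta d, \qquad \text{paper: } \operatorname{logit} p(d) = a + e^{b} d,$$

so $\alpha = e^{a}$ and $\beta = e^{b}$; centering $a$ and $b$ at 0 on the paper's scale corresponds to centering $\alpha$ and $\beta$ near 1 on bcrm's scale, hence the prior mean (1, 1).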

tzkli commented 1 year ago

Hi @Sirius2713, thanks for pointing this out! I've corrected the .rmd file. Will upload the new figures from the comparison with yours later.

GabeNicholson commented 1 year ago

Okay, I've completed the code for my proposed approach (gabe_bonus_proof_of_concept.ipynb). It does look like there is some hope for the reweighting approach. I used a non-Bayesian likelihood approach (sklearn's logistic regression) to test it out for speed reasons. My original simulations with the sklearn logistic regression give accuracy scores of 48.3% and 62% for the two scenarios, respectively.

With the reweighting approach, I'm able to bring the first scenario's accuracy up to 52.3%. It still is not much, but that's more than I hoped for, and it can serve as a potential answer unless we come up with something better...

tzkli commented 1 year ago

The discrepancy I got for the two scenarios is even starker... (23% vs. 76%, see crm_sim_v3.html). It could be the difference in seeds, the priors, or both. For the first scenario the difference in probabilities between the first two dose levels is small. I think the result might improve further if we allow for a different escalation strategy. Also, should we experiment with different priors for the logit model parameters?

GabeNicholson commented 1 year ago


What strategy are you trying in order to reduce the discrepancy? Or are you talking about questions 1 and 2? Because we already have those.

Sirius2713 commented 1 year ago

> The discrepancy I got for the two scenarios is even starker... (23% vs. 76%, see crm_sim_v3.html). It could be the difference in seeds, the priors, or both. For the first scenario the difference in probabilities between the first two dose levels is small. I think the result might improve further if we allow for a different escalation strategy. Also, should we experiment with different priors for the logit model parameters?

Hi @tzkli, maybe you can try using "median" instead of "mean" when estimating the posteriors and in sdose.calculate. Also, "sdose" in the bcrm function is not the original dose level; it's the dose label mentioned in the paper.

GabeNicholson commented 1 year ago

Also, looking back at my reweighting strategy, I think it is too noisy and not good enough for practical use. A better approach would be to quantify the model's uncertainty in its predictions so that we can estimate how uncertain we are about the selected dose. This uncertainty should be largest when we have only a few samples at the selected dose. We could then use this uncertainty level to hedge our predictions across the simulations and see how often we get the correct dose when our uncertainty is below a certain threshold. That accuracy will definitely be higher and could reach a level around 60%.

Edit: I found a working solution that shrinks the difference to less than 2% while increasing accuracy. The result can be found in my notebook.

tzkli commented 1 year ago

Hi @Sirius2713, I corrected the use of sdose vs. dose in crm_sim_v3 (I don't use sdose as an input any more in v3). I wasn't clear in my earlier post. My point was: have we tried tinkering with the priors and other model specifications?

Is the R file in the main folder your most recent code for generating the figures?

kangbw702 commented 1 year ago

> Hi @Sirius2713, I corrected the use of sdose vs. dose in crm_sim_v3 (I don't use sdose as an input any more in v3). I wasn't clear in my earlier post. My point was: have we tried tinkering with the priors and other model specifications?
>
> Is the R file in the main folder your most recent code for generating the figures?

Hi @tzkli,

The crm_sim_v3 file is dealing with the simulation rather than the bonus question, right? When running bcrm, you used the prior list(4, c(1, 1), rbind(c(1, 0), c(0, 1))) and set no dose-skipping constraint. We may need to see whether the results are sensitive to these settings.

Best, Bowei

Sirius2713 commented 1 year ago

> Hi @Sirius2713, I corrected the use of sdose vs. dose in crm_sim_v3 (I don't use sdose as an input any more in v3). I wasn't clear in my earlier post. My point was: have we tried tinkering with the priors and other model specifications?
>
> Is the R file in the main folder your most recent code for generating the figures?

Hi @tzkli, I tried different specifications of the prior means and variances, but I don't think the results are sensitive to them. I do find an improvement from changing the mean estimate to the median estimate.

tzkli commented 1 year ago

Hi all, thanks for the follow-ups. I'm effectively using crm_sim_v3 to do some tinkering to get at the bonus question, because we already have the figures for questions (1) and (2). I tried including a loss function this time, and the discrepancy has narrowed (0.667 vs. 0.625). This is sort of cheating (since the loss function is partly tailored to the configuration of the true probabilities). If we don't have anything better, perhaps we can use this?

tzkli commented 1 year ago

Please refer to the v3 html file for the most recent results.

kangbw702 commented 1 year ago

Hi Team,

I revisited my previous idea for the bonus question: adding extra dose levels. The simulation results can be found in crm_sim_v4_add_lower_dose.rmd. In brief, for scenario 1 the accuracy increases from 39% to 46%, and scenario 2's accuracy stays at 67%. The idea is to use the first 1/3 of patients as a kind of training group to evaluate the dose visit frequencies. If the frequency is highly skewed and concentrated at level 1, an extra lower dose (level 0) is added. This design mainly affects scenario 1. Although a big gap remains between the two scenarios, the difference can be further mitigated by tuning parameters such as the fraction of training samples, the number of extra levels added, etc. I think the current simulation results demonstrate the idea well. Is it possible to add one slide for this result, for example as an alternative idea? I can also take one minute to briefly present the slide tomorrow. Thanks!
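The decision rule itself is only a few lines; in sketch form (illustrative names and an assumed concentration threshold, not the actual crm_sim_v4 code):

```r
# Sketch of the add-a-lower-dose rule: after the first 1/3 of patients,
# open an extra lower level if assignments pile up at level 1 (threshold assumed)
n_total     <- 36
n_train     <- n_total / 3
dose_levels <- c(0.5, 1, 3, 5, 6)    # original dose levels (mg)
extra_dose  <- 0.25                  # hypothetical extra lower dose (level 0)

assigned <- sample(1:5, n_train, replace = TRUE,
                   prob = c(0.8, 0.1, 0.05, 0.03, 0.02))   # fake first-stage assignments
freq <- table(factor(assigned, levels = 1:5)) / n_train

if (freq["1"] > 0.7) {                         # "highly concentrated at level 1"
  dose_levels <- c(extra_dose, dose_levels)    # add level 0 for the rest of the trial
}
dose_levels
```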

Best, Bowei

GabeNicholson commented 1 year ago

> Hi all, thanks for the follow-ups. I'm effectively using crm_sim_v3 to do some tinkering to get at the bonus question, because we already have the figures for questions (1) and (2). I tried including a loss function this time, and the discrepancy has narrowed (0.667 vs. 0.625). This is sort of cheating (since the loss function is partly tailored to the configuration of the true probabilities). If we don't have anything better, perhaps we can use this?

@tzkli I have a solution that converges the results to less than a 2% difference. Also, we reduce the discrepancy by increasing the accuracy of scenario 1, not by decreasing scenario 2's, so a penalty term probably won't work.

If you would like we can talk more about it on a call.

GabeNicholson commented 1 year ago

Hi Team,

I revisited my previous idea for bonus question - adding extra dose levels. The simulation results can be found in crm_sim_v4_add_lower_dose.rmd. In brief, for scenario 1, the accuracy will increase from 39% to 46%. And scenario 2's accuracy keeps at 67%. The idea is using the first 1/3 patients as kinda training group to evaluate the dose visit frequency. If the freq is highly skewed and concentrated in level 1, an extra lower (level 0) dose will be added. This design mainly affects scenario 1. Although a big gap is still there between two scenarios, the difference can be further mitigated by tunning parameters like fraction of training samples, the number of extra levels added, etc. I think current simultion results demonstrate the idea well. Is it possible to add one slide for this result, for example, as an alternative idea? I can also use 1 minute to briefly present the slide tomorrow. Thanks!

Best, Bowei

Do you have graphs showing why this works? I'm surprised that biasing it away from over-sampling at level 1 is optimal, since level 1 is the true dose for scenario 1. Also, the difference is still very large. I managed to get the difference to less than 3% by only analyzing simulations where the proportion of times the selected dose was sampled is less than 90%, basically filtering out situations where the model gets stuck sampling a single dose for almost all 36 patients. Based on this, I would suggest trying to force a new dose at 3/4 of the samples, since 1/3 sounds too early to start biasing it.

Also, I tested your simulation and for my run, scenario 2's accuracy increased to 70% and scenario 1's accuracy was 47%.

kangbw702 commented 1 year ago

I explain the idea in slide 19. Think of a linear regression case where you would like to predict Y|X=1. The prediction will be more accurate when the data cover X in [0, 2] than when they cover only [0.9, 1.1]. If you only have data at X = 1 (even a huge amount), the slope is not identified, so the standard errors of the estimates (and of predictions away from X = 1) blow up. In other words, oversampling at level 1 is not necessarily optimal; the dose range also matters.

Your idea is fantastic and can be presented as the main approach to address Q3.

It is a good idea to try to get a better result by tuning parameters such as when to add an extra dose level. For illustration, I used 1/3 somewhat arbitrarily.

Are the 70% and 47% accuracies after adding the extra dose?

Best, Bowei

GabeNicholson commented 1 year ago

> I explain the idea in slide 19. Think of a linear regression case where you would like to predict Y|X=1. The prediction will be more accurate when the data cover X in [0, 2] than when they cover only [0.9, 1.1]. If you only have data at X = 1 (even a huge amount), the slope is not identified, so the standard errors of the estimates (and of predictions away from X = 1) blow up. In other words, oversampling at level 1 is not necessarily optimal; the dose range also matters.
>
> Your idea is fantastic and can be presented as the main approach to address Q3.
>
> It is a good idea to try to get a better result by tuning parameters such as when to add an extra dose level. For illustration, I used 1/3 somewhat arbitrarily.
>
> Are the 70% and 47% accuracies after adding the extra dose?
>
> Best, Bowei

I just ran your notebook but changed the seed from 1 to 2 to see what would happen, and then read the selected MTD output text at the end of the cell. Your motivation is very creative, but I think you may be mistaking an inference task for a predictive one, since you do not need low standard error estimates to make good predictions. In fact, the most accurate models that win Kaggle ML competitions have a massive amount of collinearity in their predictors (which means inflated SEs) but still make the best predictions. There is still some debate about why this works, so it is not obvious. This also does not mean your method cannot work, but if it does, I would be surprised if it was due to lowering the SE. If I had to guess, I bet your method is working because it forces simulations that would have sampled only one dose to sample more than one dose. That is one reason why S1 does worse than S2, which I touch on in my slides. You should try making the cutoff later on, though: instead of 1/3, make it at 3/4 of the sample, because 1/3 is still too early to tell whether they will get stuck sampling one dose for the remainder of the simulation.

kangbw702 commented 1 year ago

> I explain the idea in slide 19. Think of a linear regression case where you would like to predict Y|X=1. The prediction will be more accurate when the data cover X in [0, 2] than when they cover only [0.9, 1.1]. If you only have data at X = 1 (even a huge amount), the slope is not identified, so the standard errors of the estimates (and of predictions away from X = 1) blow up. In other words, oversampling at level 1 is not necessarily optimal; the dose range also matters. Your idea is fantastic and can be presented as the main approach to address Q3. It is a good idea to try to get a better result by tuning parameters such as when to add an extra dose level. For illustration, I used 1/3 somewhat arbitrarily. Are the 70% and 47% accuracies after adding the extra dose? Best, Bowei
>
> I just ran your notebook but changed the seed from 1 to 2 to see what would happen, and then read the selected MTD output text at the end of the cell. Your motivation is very creative, but I think you may be mistaking an inference task for a predictive one, since you do not need low standard error estimates to make good predictions. In fact, the most accurate models that win Kaggle ML competitions have a massive amount of collinearity in their predictors (which means inflated SEs) but still make the best predictions. There is still some debate about why this works, so it is not obvious. This also does not mean your method cannot work, but if it does, I would be surprised if it was due to lowering the SE.

I agree that prediction and estimation are two different tasks, and the best prediction model does not necessarily do good inference. But the SE inflation in an overfitted model (like the ML models you mentioned) is due to highly correlated covariates, and that issue does not harm prediction accuracy. The SE inflation caused by small variation in X, however, harms both estimation and prediction. In fact, in a basic linear regression setting, the (X'X)^(-1) term also appears in the SE for prediction.
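Concretely, for the fitted mean at a new point $x_0$ in ordinary least squares,

$$\operatorname{Var}\big(\hat{y}(x_0)\big) = \sigma^2\, x_0^{\top}(X^{\top}X)^{-1} x_0$$

(add another $\sigma^2$ for a new observation), so a small spread in $X$ inflates $(X^{\top}X)^{-1}$ and hence the prediction variance as well, not just the coefficient SEs.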

I will try making the cutoff later, but that may leave the truly extreme cases too few chances to adjust.

kangbw702 commented 1 year ago

@GabeNicholson ,

| Setting | Scenario 1 accuracy (%) | Scenario 2 accuracy (%) |
| --- | --- | --- |
| No intervention | 39.0 | 67.0 |
| Evaluate at 1/3 | 46.4 | 67.6 |
| Evaluate at 2/3 | 42.2 | 68.0 |
| Evaluate at 3/4 | 43.6 | 66.6 |

It looks like the accuracy is not very sensitive to the choice of when to add the extra level.

Sirius2713 commented 1 year ago

Hi @kangbw702, I noticed that when you add a lower dose (0.25 mg), you assume its toxicity is half that of 0.5 mg. I might be missing something here, but is there any justification for this interpolation? We're assuming the dose-toxicity relationship is logistic rather than linear.

kangbw702 commented 1 year ago

> Hi @kangbw702, I noticed that when you add a lower dose (0.25 mg), you assume its toxicity is half that of 0.5 mg. I might be missing something here, but is there any justification for this interpolation? We're assuming the dose-toxicity relationship is logistic rather than linear.

The true ptox is chosen arbitrarily; the only constraint I used is that ptox at level 0 is smaller than at level 1. That dose and ptox follow a logit model is itself an assumption, not necessarily the true relationship. We can definitely try different ptox values at dose level 0 and test the sensitivity.

xinyi030 commented 1 year ago

Hi Team,

I appreciate your fantastic presentation today.

I've just posted the most recent version of the report (NonStat.Rmd), which includes the Introduction, Literature Review, and Methodology sections and will be finalized by tomorrow.

Please review it and, by noon on May 26th, add your results to the sections "Model Reproduction", "Simulation Studies", and "Results and Discussion". Include all relevant information: your code, figures, required analysis, and conclusions. To make the report more organized, you can add subheadings. Also include your name so we can credit you.

Feel free to let me know if you have any questions. Thanks.

Best, Xinyi

xinyi030 commented 1 year ago

@tzkli Per our discussion on WhatsApp, I've knitted the latest version of our report based on Hongzhang's edits and yours. Here it is: Final Project_ Bayesian Inference Clinical Trials and Nonparametric Models(1).pdf

It seems that there are still some formatting issues in your part (the equation), but unfortunately I didn't manage to solve the problem for you.

In case you would like to try, please directly edit 'NonStat_v5.Rmd' (with 'BibFile_v5.bib'), knit it, and then upload a 'NonStat_v6.Rmd' and 'NonStat_v6.pdf' that include your changes by 9:30 pm CT. I will then upload that version to Canvas.

Let me know your thoughts. Thanks!

Best, Xinyi