looking at the simData function I found a possible bug in the selection of sample parameters (beta_s) from the reference dataset.
On line 347 of simData function, simulation parameters of the current cluster(k) and sample pair(s) are selected in the bs_ks object, but for both elements of the sample pair the parameters of the first sample are used (bs$sample_id[[s0[1]]]).
This implies that counts of the two samples of the current pair are both simulated using the beta_s parameter of the first sample, correct me if I'm wrong.
To overcome this limit I simply split the bs_ks object into 2 different objects, one for each element of the sample pair:
bs_ks1 <- cbind(b0, bs$cluster_id[[k0]], bs$sample_id[[s0[1]]])bs_ks2 <- cbind(b0, bs$cluster_id[[k0]], bs$sample_id[[s0[2]]])
and then used these objects in the following code lines.
I attach the edited simData function (edited lines are marked with comments), let me know if this change make sense.
simData_debug.R.zip
Hi Muscat team,
looking at the simData function I found a possible bug in the selection of sample parameters (beta_s) from the reference dataset. On line 347 of simData function, simulation parameters of the current cluster(k) and sample pair(s) are selected in the bs_ks object, but for both elements of the sample pair the parameters of the first sample are used (
bs$sample_id[[s0[1]]]
). This implies that counts of the two samples of the current pair are both simulated using the beta_s parameter of the first sample, correct me if I'm wrong.To overcome this limit I simply split the bs_ks object into 2 different objects, one for each element of the sample pair:
bs_ks1 <- cbind(b0, bs$cluster_id[[k0]], bs$sample_id[[s0[1]]])
bs_ks2 <- cbind(b0, bs$cluster_id[[k0]], bs$sample_id[[s0[2]]])
and then used these objects in the following code lines.I attach the edited simData function (edited lines are marked with comments), let me know if this change make sense. simData_debug.R.zip
Thanks, Simone