Recommended N is systematically higher for networks with negative edges.

axrhart commented 1 year ago

Dear Mihai,

We want to use your package to support our sample size considerations in an upcoming proposal. While playing around with different options for the possible network, we noticed that matrices with more negative edges tend to produce a markedly higher sample size recommendation than mostly positive networks. For example, flipping the sign of all edges of a matrix generated with positive = 0.9 increased the recommended sample size by 162 (from 302 to 464).

Code to reproduce:

# Params
seed <- 2022
nodes <- 10
density <- 0.4
positive <- 0.9
min_N <- 50
max_N <- 1000

# Set seed
set.seed(seed)

# Generate model
model_matrix <- powerly::generate_model("ggm",
                                        nodes = nodes,
                                        density = density,
                                        positive = positive)

# Flip signs
model_matrix_neg <- model_matrix * -1

# Compute results
results <- powerly::powerly(min_N,
                            max_N,
                            model_matrix = model_matrix)

# Method run completed (18.6493 sec):
# - converged: yes
# - iterations: 1
# - recommendation: 302

results_neg <- powerly::powerly(min_N,
                                max_N,
                                model_matrix = model_matrix_neg)

# Method run completed (35.4334 sec):
# - converged: yes
# - iterations: 2
# - recommendation: 464
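
Because the model above is generated with positive = 0.9, it is worth checking how many edges are actually positive before and after the flip. A minimal base R sketch is shown below; `edge_sign_summary` is a hypothetical helper (not part of powerly), and the 4-node `toy` matrix is made up for illustration rather than produced by `powerly::generate_model`.

```r
# Hypothetical helper (not part of powerly): count the positive and negative
# non-zero edges in a symmetric edge-weight matrix.
edge_sign_summary <- function(weights) {
    # Each undirected edge appears exactly once in the upper triangle.
    edges <- weights[upper.tri(weights)]
    edges <- edges[edges != 0]

    c(
        edges = length(edges),
        positive = sum(edges > 0),
        negative = sum(edges < 0)
    )
}

# Toy 4-node matrix with three positive edges and one negative edge.
toy <- matrix(0, nrow = 4, ncol = 4)
toy[1, 2] <- 0.30
toy[1, 3] <- 0.25
toy[2, 4] <- 0.20
toy[3, 4] <- -0.15
toy <- toy + t(toy)

# Flipping all signs swaps the positive and negative counts.
summary_original <- edge_sign_summary(toy)
summary_flipped <- edge_sign_summary(toy * -1)

print(summary_original)
print(summary_flipped)
```

The same helper could be applied to `model_matrix` and `model_matrix_neg` from the code above, assuming they are symmetric matrices of edge weights.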

I think I am using the CRAN version of powerly.

Unfortunately, I do not have much experience with network theory, and I had some difficulty wrapping my head around your code without class diagrams to guide my exploration. Is it expected behavior that matrices with negative edges require more participants to reach the same sensitivity? Or am I doing something wrong?

Best, Alex

P.S: Thanks for your effort developing this package. So far, the workflow and plots have been really enjoyable/helpful and it was super interesting reading some object-oriented R code. 😊

mihaiconstantin commented 1 year ago

Dear Alex,

I am happy to hear that you are using powerly for your sample size planning.

I ran the code you provided (using powerly version 1.8.6 from CRAN) and obtained results in line with what you reported for both true models:

The recommendation for the positive scenario was 301.

The recommendation for the negative scenario was, indeed, higher, namely 475.

I believe this behavior is to be expected and may be related to suppression effects (maybe this post helps). In fact, Figure 6 in our manuscript, which has now been accepted for publication at Psychological Methods, shows a similar case. In our simulation setup, true models one and six used identical edge weight values but differed in the number of positive edges: model one had 90% positive edges and model six had 50%. As you can see in the results plot below, after 6000 method runs (i.e., powerly calls), the average recommended sample size was indeed larger for the model with more negative edges (roughly 563) than for the model with fewer negative edges (around 526).

[Image: results plot, 2022-10-06-13-45-24]
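
One possible intuition for why a sign flip is not a neutral change is that it alters the implied marginal correlations in magnitude, not just in sign. The base R sketch below is a toy illustration only: it assumes a partial correlation matrix P is mapped to a unit-diagonal precision matrix K = I - P, and it uses a made-up 3-node network. `implied_correlations` is a hypothetical helper, and none of this is taken from the manuscript or from powerly's internals.

```r
# Assumption: a partial correlation matrix P (zero diagonal) corresponds to a
# precision matrix with unit diagonal, K = I - P.
implied_correlations <- function(partial_correlations) {
    precision <- diag(nrow(partial_correlations)) - partial_correlations

    # A valid model requires a positive definite precision matrix.
    stopifnot(all(eigen(precision, symmetric = TRUE)$values > 0))

    # Invert to get the covariance matrix, then standardize it.
    cov2cor(solve(precision))
}

# Toy 3-node network in which every edge weight is 0.3.
positive_network <- matrix(0.3, nrow = 3, ncol = 3)
diag(positive_network) <- 0

# The same edge weights with all signs flipped.
negative_network <- positive_network * -1

cor_positive <- implied_correlations(positive_network)
cor_negative <- implied_correlations(negative_network)

# Marginal correlation between nodes 1 and 2 in each network.
print(round(cor_positive[1, 2], 3))  # 0.429
print(round(cor_negative[1, 2], 3))  # -0.231
```

In this toy case, the all-negative network implies weaker marginal associations than the all-positive one, even though the edge weights have the same absolute value. Whether this mechanism accounts for the difference in the recommendations above is not established here.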

P.S. Thank you for the feedback regarding the package and the workflow! We are working on a project where we introduce an API for powerly to enable others to easily run sample size calculations for different kinds of models and custom performance measures. We chose SEM as an example to illustrate this API. I will make sure to include a UML diagram in the package documentation at powerly.dev.

mihaiconstantin commented 1 year ago

@axrhart, I will mark this as completed. If you have further questions on the topic, please feel free to reopen or reach out!

mihaiconstantin / powerly, issue #34