QuentinAndre / pyprocessmacro

A Python library for moderation, mediation and conditional process analysis.
MIT License

Analyzing a 3 column pandas data frame #11

Closed Foadsf closed 2 years ago

Foadsf commented 4 years ago

Dear Quentin,

First of all, thank you for creating this library. Apparently this Hayes method is very popular among the humanities and I had no idea how to transfer the algorithms from SPSS/R to Python. Thank FSM that you have already done it :)

I'm trying to follow this example where there are just 3 variables: independent, outcome and moderator. The data can be downloaded from here and the CSV file is in hayes2018data/disaster/disaster.csv.

I looked through the examples in the README.md and couldn't find one with only three variables. My attempt to omit the other parameters (m, z, model) also caused an error:

ValueError: The variables supplied do not match the definition of Model 3
Expected variable(s) not supplied: m

I would appreciate it if you could help me understand what is the problem and how I can resolve it. Thanks in advance and looking forward to hearing back.

Best, Foad

Foadsf commented 4 years ago

Based on this page (a Shiny app by Keon-Woong Moon, author of processR, the R package for PROCESS), I assume I should set the model parameter to 1? But I get the error:

ValueError: The variables supplied do not match the definition of Model 1
Expected variable(s) not supplied: m
Variable(s) supplied not supported by the model: w

Foadsf commented 4 years ago

Ok, I think I have figured it out. The correct format should be:

p = Process(data=df, model=1, x="FRAME", y="JUSTIFY", m="SKEPTIC")
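
Both error messages above come down to a mismatch between the variable names passed in and the names the chosen model accepts. The sketch below is a hypothetical illustration of that kind of check in plain Python, not pyprocessmacro's actual code: `MODEL_VARIABLES` and `check_variables` are made-up names, and only the Model 1 entry (x, y, m) is taken from this thread.

```python
# Hypothetical lookup table, NOT taken from pyprocessmacro's source.
# Only the Model 1 entry (x, y, m) is confirmed by the call shown above.
MODEL_VARIABLES = {1: {"x", "y", "m"}}


def check_variables(model, supplied):
    """Return a list of problems with the supplied variable names."""
    expected = MODEL_VARIABLES[model]
    missing = sorted(expected - set(supplied))
    unsupported = sorted(set(supplied) - expected)
    problems = []
    if missing:
        problems.append(
            "Expected variable(s) not supplied: " + ", ".join(missing)
        )
    if unsupported:
        problems.append(
            "Variable(s) supplied not supported by the model: "
            + ", ".join(unsupported)
        )
    return problems


# Passing the moderator as "w" fails for Model 1; passing it as "m" works.
print(check_variables(1, ["x", "y", "w"]))
print(check_variables(1, ["x", "y", "m"]))
```

Under this reading, the moderator SKEPTIC has to be passed as `m` in Model 1, which is what the corrected call does.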

Foadsf commented 4 years ago

Now I can't figure out how the plotting functions plot_conditional_direct_effects and plot_conditional_indirect_effects work! :(

QuentinAndre commented 4 years ago

Hi Foadsf,

The functions for the conditional direct effects and conditional indirect effects only make sense for mediation models. Since you are using a moderation model (model 1), those functions are not available. Unless I am missing something...?

Foadsf commented 4 years ago

@QuentinAndre

Thanks for the reply. I'm actually not an expert on this topic, as I'm doing some data analysis for my wife, for her project in Psychology :) The dataset I have been given has several semi-continuous variables (e.g., age, self-compassion, anxiety ...) and some discrete variables (e.g., gender, education ...). The reason I selected Model 1 is just that it is the easiest to begin with. Here is the head of the data frame in CSV format:

,sc,age,hads,edu,mhc,gender,nl,status
0,49.0,19.0,19.0,0,74.0,1,66.0,1
1,37.0,20.0,27.0,0,34.0,0,4.0,0
2,31.0,20.0,24.0,0,32.0,0,3.0,0
3,30.0,21.0,21.0,0,27.0,0,4.0,1
4,34.0,22.0,26.0,1,24.0,0,18.0,0
5,38.0,22.0,25.0,1,38.0,1,14.0,0
6,31.0,23.0,20.0,0,14.0,0,5.0,0
7,33.0,23.0,16.0,0,30.0,0,17.0,0
8,32.0,24.0,18.0,0,56.0,1,16.0,0
9,52.0,24.0,20.0,0,59.0,0,96.0,1
10,26.0,24.0,32.0,1,33.0,1,60.0,0
11,52.0,26.0,29.0,1,60.0,1,12.0,0
12,53.0,26.0,13.0,1,81.0,0,60.0,1
13,34.0,26.0,12.0,1,50.0,1,54.0,1

For example, we need to know the correlation between sc and mhc while age is a moderator/mediator (I'm not sure about the difference!). What is the best model for that? I would appreciate it if you could help me understand the method as well.

I have also been tinkering with processR, Keon-Woong Moon's R implementation of the methodology, with little success so far (good tutorials here, here, and here, plus the documentation).

QuentinAndre commented 4 years ago

Hi,

Mediators and moderators are conceptually different, and have very different meanings. In Model 1, you are assuming that age is a moderator: That is, the effect of SC on MHC might be different for people who are younger vs. older.

You first need to check for the presence of a significant interaction effect between Age and SC. If you have one, you can then inspect the predicted relationship between SC and MHC at different values of Age by using the function spotlight_direct_effect(). I refer you to the documentation for additional details.
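
To make the moderation idea concrete: a simple moderation model regresses the outcome on the predictor, the moderator, and their product. The coefficient on the product is the interaction effect, and the effect of SC at a given Age is b1 + b3 * Age. The following is a minimal plain-Python sketch of that arithmetic on the 14 rows posted above. It is not pyprocessmacro code, the function names are illustrative assumptions, and it computes no standard errors or p-values, so it cannot say whether the interaction is significant.

```python
# Rows (sc, age, mhc) copied from the CSV excerpt earlier in this thread.
ROWS = [
    (49, 19, 74), (37, 20, 34), (31, 20, 32), (30, 21, 27),
    (34, 22, 24), (38, 22, 38), (31, 23, 14), (33, 23, 30),
    (32, 24, 56), (52, 24, 59), (26, 24, 33), (52, 26, 60),
    (53, 26, 81), (34, 26, 50),
]


def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_moderation(rows):
    """Least-squares fit of y = b0 + b1*x + b2*w + b3*x*w for rows (x, w, y)."""
    design = [[1.0, x, w, x * w] for x, w, _ in rows]
    y = [row[2] for row in rows]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(4)] for i in range(4)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(4)]
    return solve(xtx, xty)


def conditional_effect(coefs, w):
    """Effect of x on y when the moderator equals w: b1 + b3 * w."""
    return coefs[1] + coefs[3] * w


coefs = fit_moderation(ROWS)
print("b0, b1 (sc), b2 (age), b3 (sc x age):", coefs)
for age in (19, 26):  # youngest and oldest ages in the excerpt
    print("effect of sc on mhc at age", age, "=", conditional_effect(coefs, age))
```

With only 14 rows the estimates are unstable. The point is only to show what "the effect of SC at different values of Age" means numerically.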

Foadsf commented 4 years ago

@QuentinAndre Thanks for the ELI5 explanations. :)

We do actually see a noticeable difference in the sc-mhc correlation between people below and above 30. I used conventional tools (NumPy, pandas, matplotlib) to visualize it. Would you be so kind as to provide a minimal working example for spotlight_direct_effect() using the above CSV data?
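
The below/above-30 comparison described here can be sketched as a pair of subgroup Pearson correlations. This is a minimal plain-Python sketch with assumed helper names (`pearson`, `correlation_by_age`), not a pyprocessmacro example. The excerpt posted earlier only covers ages 19 to 26, so the cutoff of 23 used below is an arbitrary stand-in for 30, chosen purely so that both groups are non-empty.

```python
from math import sqrt

# Rows (sc, age, mhc) copied from the CSV excerpt earlier in this thread.
ROWS = [
    (49, 19, 74), (37, 20, 34), (31, 20, 32), (30, 21, 27),
    (34, 22, 24), (38, 22, 38), (31, 23, 14), (33, 23, 30),
    (32, 24, 56), (52, 24, 59), (26, 24, 33), (52, 26, 60),
    (53, 26, 81), (34, 26, 50),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_by_age(rows, cutoff):
    """Return the sc-mhc correlation for age < cutoff and for age >= cutoff."""
    younger = [(sc, mhc) for sc, age, mhc in rows if age < cutoff]
    older = [(sc, mhc) for sc, age, mhc in rows if age >= cutoff]
    return pearson(*zip(*younger)), pearson(*zip(*older))


# Cutoff 23 is an illustrative assumption; the full dataset would use 30.
r_younger, r_older = correlation_by_age(ROWS, 23)
print("sc-mhc correlation, age < 23: ", r_younger)
print("sc-mhc correlation, age >= 23:", r_older)
```

Comparing subgroup correlations like this is only a rough check. It depends on where the cutoff is placed and is not the same as testing the interaction term of a moderation model.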