variables with high number of 0s

Joseph020304 commented 5 months ago

Hi, congratulations on the awesome package and its companion, robmedExtra.

I am conducting an analysis of a wide set of variables. Some of the variables used as mediators have a large number of 0s, and when I run the robust analysis with those variables I get the following warnings and error:

Warning: S-estimated scale == 0: Probably exact fit; check your data
Warning: initial estim. 'init' not converged -- will be return()ed basically unchanged
Error in if (const(t, min(1e-08, mean(t, na.rm = TRUE)/1e+06))) { : missing value where TRUE/FALSE needed

The OLS bootstrap method works, but obviously the variable distribution is far from normal, so I prefer not to use it.

Since OLS still works, I was wondering whether there is a way to overcome the error with the robust method. Also, I have not found out what the maximum number of 0s allowed in a mediator variable is to avoid the error.

Thanks in advance

aalfons commented 5 months ago

Hi, I'm glad you like the software.

Can you be more specific? How many observations do you have and how many zeros do you have in those mediators? Do you use the mediators one at a time in different analyses, or are multiple mediators entering the model?

In general, I can provide only guesses without a reproducible example.

The computation of the robust method involves an initial subsampling step that is crucial for achieving robustness. If there is a lack of variation in such a subsample (due to too many zeros), then there is a computational issue, as reported in the error message. It's impossible for me to tell whether this can be overcome without a reproducible example.
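
To get a rough feel for how the share of zeros interacts with subsampling, here is a minimal sketch in Python. It is plain counting arithmetic, not the package's actual algorithm: the subsample size `k` and the number of subsamples `n_subsamples` are hypothetical parameters (the thread does not state what the robust method uses), and the subsamples are assumed to be drawn independently and without replacement within each subsample.

```python
from math import comb


def prob_all_zero_subsample(n_obs: int, n_zeros: int, k: int) -> float:
    """Probability that k observations drawn without replacement from
    n_obs observations, n_zeros of which equal 0, are all equal to 0."""
    if not 0 <= n_zeros <= n_obs:
        raise ValueError("n_zeros must lie between 0 and n_obs")
    if not 0 < k <= n_obs:
        raise ValueError("k must lie between 1 and n_obs")
    # comb(n_zeros, k) is 0 when k > n_zeros, so the probability is 0 there.
    return comb(n_zeros, k) / comb(n_obs, k)


def prob_any_all_zero(n_obs: int, n_zeros: int, k: int, n_subsamples: int) -> float:
    """Probability that at least one of n_subsamples independently drawn
    subsamples (each of size k) contains only zeros."""
    p = prob_all_zero_subsample(n_obs, n_zeros, k)
    return 1.0 - (1.0 - p) ** n_subsamples


if __name__ == "__main__":
    # Hypothetical numbers: 100 observations, 80 zeros, subsamples of size 3.
    print(prob_all_zero_subsample(100, 80, 3))
    print(prob_any_all_zero(100, 80, 3, 500))
```

Under these assumptions, the chance of hitting a subsample without any variation grows quickly with the share of zeros, which is why a single fixed threshold is hard to state without knowing the sample size and the model.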

Also note that although the OLS works computationally, it's unclear if it gives you a reliable answer in that case. So in that sense, the OLS just obscures the problem, whereas the robust method makes it explicit that there is a problem.

Joseph020304 commented 5 months ago

Thank you Andreas.

So, as I understand it, the initial random subsampling drawing too many 0s may be the issue here. I will avoid using these variables, since it does not really make sense to include them. I will try to find out what the threshold of 0s is. Thanks so much for the clarification!

aalfons / robmed

variables with high number of 0s #48