Closed geniusjenny closed 1 year ago
Hey @geniusjenny Thanks for the bug report! Could you please:

1. Paste the output of running session_info:

```python
import session_info
session_info.show(html=False)
```

2. Try running the adjust using:

```python
.adjust(method="cbps")
```

And let us know if this works for you?
Thanks!
Thank you so much for your reply!
```
-----
balance         0.7.0
pandas          1.4.3
session_info    1.0.0
-----
IPython         7.16.3
jupyter_client  6.1.5
jupyter_core    4.9.2
-----
Python 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:01:55) [GCC 11.3.0]
Linux-4.14.309-231.529.amzn2.x86_64-x86_64-with-glibc2.10
-----
Session information updated at 2023-04-13 20:34
```
Hey @geniusjenny I'm glad cbps worked!
As for the differences: the default method is ipw (using glm with lasso). You can read about it here: https://import-balance.org/docs/docs/statistical_methods/ipw/

You can read about cbps here: https://import-balance.org/docs/docs/statistical_methods/cbps/

You're also welcome to go over the tutorials for more examples and usage details: https://import-balance.org/docs/tutorials/
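To make the ipw idea concrete: inverse propensity weighting up-weights sample units that are under-represented relative to the target. Here is a minimal pure-Python sketch with made-up propensity scores; this illustrates the general technique, not the balance implementation, which estimates the propensities with a lasso-regularized glm:

```python
# Made-up propensity scores: estimated probability that a unit with the
# given covariates belongs to the target population (vs. the sample).
propensity = [0.2, 0.5, 0.8, 0.4]

# Inverse propensity weighting: weight each sample unit by the odds
# p / (1 - p), so under-represented units are up-weighted.
weights = [p / (1 - p) for p in propensity]

# Normalize so the weights sum to the sample size.
n = len(weights)
total = sum(weights)
normalized = [w * n / total for w in weights]
print(normalized)
```

The unit with propensity 0.8 ends up with the largest weight, since units like it are common in the target but rare in the sample.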
As for the bug with ipw:
It looks like you're missing the libgfortran.so.3 shared library, which is required by the glmnet library. You can install it using the package manager for your Linux distribution. Since you're probably using Amazon Linux 2, you can use the yum package manager to install the required library.
First, open a terminal and update your package manager repositories:

```shell
sudo yum update
```

Then, install the libgfortran package:

```shell
sudo yum install libgfortran
```
After the installation is complete, try running your Python code again. The error should be resolved.
If you still face any issues, you might need to create a symlink for the required libgfortran.so.3 file.
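For the symlink route, something along these lines is a common workaround. The library paths below are assumptions and vary by distro, so locate the installed version first; also note that ABI compatibility across libgfortran major versions is not guaranteed. The scratch-directory part at the end just demonstrates the mechanism:

```shell
# Find which libgfortran versions are installed (paths vary by distro):
#   ldconfig -p | grep libgfortran
# If e.g. libgfortran.so.4 is present but .so.3 is not, a symlink can
# bridge the gap (the path below is an assumption; use your own):
#   sudo ln -s /usr/lib64/libgfortran.so.4 /usr/lib64/libgfortran.so.3

# The same mechanism, demonstrated in a scratch directory:
dir=$(mktemp -d)
touch "$dir/libgfortran.so.4"   # stand-in for the real installed library
ln -s "$dir/libgfortran.so.4" "$dir/libgfortran.so.3"
ls -l "$dir/libgfortran.so.3"   # shows the symlink and its target
```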
Could you please test the above and see if it solves it for you?
Hey @geniusjenny
Did my last comment help resolve the issue for you?
I'm closing the issue in the meantime. If it didn't help, please feel free to reopen it with more details on the current status.
Thanks!
Hi @talgalili,
Thank you so much for your help! I figured out what the problem probably is: I am not able to install OS packages on the instance I am connecting to, even with those commands. So I switched to another platform that serves Jupyter Notebook, and the sample code runs smoothly with `sample_with_target.adjust(max_de=None)`.
One follow-up problem I am still facing: when I switched to my own dataset (sample 578k, population 6 million), the function emits several warnings for `.adjust(method="cbps")`:
```
WARNING (2023-04-17 18:39:29,331) [cbps/cbps (line 579)]: Convergence of bal_loss function has failed due to 'Maximum number of function evaluations has been exceeded.'
INFO (2023-04-17 18:39:29,332) [cbps/cbps (line 597)]: Running GMM optimization
WARNING (2023-04-17 19:01:00,310) [cbps/cbps (line 612)]: Convergence of gmm_loss function with gmm_init start point has failed due to 'Maximum number of function evaluations has been exceeded.'
WARNING (2023-04-17 19:22:22,563) [cbps/cbps (line 630)]: Convergence of gmm_loss function with beta_balance start point has failed due to 'Maximum number of function evaluations has been exceeded.'
INFO (2023-04-17 19:22:23,116) [cbps/cbps (line 726)]: Done cbps function
```
Is this something I should worry about? Thank you so much!
Hey @geniusjenny Sorry for the late reply.
I think it means that the function wasn't able to fully correct the bias it detected. I suggest you run the following on the adjusted object (say it's called x):

```python
x.covars().plot()
```

Then take a look at how much adjustment you got and whether any features still show a big imbalance.
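If plotting is inconvenient, the same kind of check can be done numerically: compare the (weighted) covariate means of the sample against the target. Here is a pure-Python sketch of the standardized-mean-difference idea on toy data; the variable names and weights are made up for illustration and this is not the balance API:

```python
from statistics import mean, stdev

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Toy covariate: the sample skews young relative to the target population.
sample_age = [25, 30, 28, 35, 27, 26, 31, 29]
target_age = [25, 35, 45, 55, 30, 50, 40, 60]

uniform = [1.0] * len(sample_age)
# Hypothetical weights an adjustment (ipw/cbps) might produce,
# up-weighting the older respondents.
adjusted = [0.5, 0.8, 0.6, 2.0, 0.5, 0.5, 1.5, 0.7]

def smd(weights):
    # Standardized mean difference vs. the target, scaled by the target SD.
    return abs(weighted_mean(sample_age, weights) - mean(target_age)) / stdev(target_age)

print(f"before adjustment: {smd(uniform):.3f}")
print(f"after adjustment:  {smd(adjusted):.3f}")
```

A drop in the standardized mean difference after adjustment means the weights moved the covariate closer to the target; features whose gap stays large are the ones worth worrying about.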
Good luck :)
Describe the bug
I got `OSError: libgfortran.so.3: cannot open shared object file: No such file or directory` when I ran `sample_with_target.adjust(max_de=None)`
Session information
Already satisfied all the requirements in the overview pages, and installed glmnet_python and balance using the sample code.