Open tanelv opened 2 years ago
Hmmm, haven't seen that error before. How large are the files you are trying to process? What OS are you using? I'll try to install using their methods to see if I can reproduce. I usually use the pip install of the python branch of jjhelmus, as it usually seemed to be more stable, but I'm confused why the compilers wouldn't have helped.
Also, while I try to see if I can reproduce it, I recommend opening an issue on their issue tracker as well. Maybe someone else has run into the same issue there.
@tanelv I'm wondering why there are two different environments involved (cbc and wradlib_xr) in the traceback? If things get picked up from another environment this is usually a big source of problems.
If you can provide any additional details, this would help very much for diagnosing.
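One quick way to check whether packages are being picked up from another environment is to compare each module's file path with the prefix of the active interpreter. This is only a minimal sketch: the helper name `modules_outside_env` is made up for illustration, and the module names in the example call are just the ones discussed in this thread.

```python
import importlib.util
import os
import sys


def modules_outside_env(names, prefix=sys.prefix):
    """Return {module: path} for modules that resolve outside `prefix`.

    Modules that cannot be found at all are reported with a path of None.
    """
    prefix = os.path.realpath(prefix)
    outside = {}
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            outside[name] = None
            continue
        origin = spec.origin
        if origin in (None, "built-in", "frozen"):
            continue  # part of the interpreter itself, no file to compare
        path = os.path.realpath(origin)
        if os.path.commonpath([prefix, path]) != prefix:
            outside[name] = path
    return outside


if __name__ == "__main__":
    print("active prefix:", sys.prefix)
    for mod, path in modules_outside_env(["numpy", "pyart", "cylp"]).items():
        print(mod, "->", path or "not installed")
```

Run it with the environment activated. Any path printed that points into a different environment directory would be worth cleaning up before debugging further.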
Ah good catch @kmuehlbauer ! Yeah I second that as well.
Hm, good points. The files are IRIS RAW files, around 5-10 MB each. I use CentOS 7:
(cbc) [a93859@stage63 ~]$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
The files should be OK, I used the phase_proc_lp function with CyLP on the same files on an older server (running on Scientific Linux 6.9) successfully, but our university is moving to the new HPC system and I would need to migrate to this new system.
Yes, there are two environments. I first tried to get CyLP working under the wradlib_xr environment (where there are both wradlib and pyart installations), but as I could not get it working there I decided to try to make a new environment only for pyart (the cbc env). Actually I managed to get to the same point in the wradlib_xr env. CyLP installation finally succeeded, but the script hangs with the same error. When running the script in the wradlib_xr env, the traceback does not refer to the other env. But if the two environments still might cause troubles, should I delete both and make a new environment and try to install there?
But if the two environments still might cause troubles, should I delete both and make a new environment and try to install there?
Just to be on the safe side. It might not solve the issue, but we would know for sure then.
Sorry for taking so long to answer. I tried to first update the current Anaconda installation, but as it stayed solving the environment for more than 6 hours I stopped it, removed Anaconda completely and installed a new Anaconda from scratch using the current latest version (https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh). These are the steps I took:
wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh
bash Anaconda3-2021.11-Linux-x86_64.sh
conda create -n pyart_py38 python=3.8 arm_pyart coin-or-cbc numba gdal -c conda-forge
conda activate pyart_py38
conda install -c conda-forge pkg-config
pip install cylp
To install CyLP I followed these instructions https://github.com/coin-or/CyLP
And it still hangs with the same error as before (*** Error in `python': free(): invalid pointer: 0x000055a16c1efc68 ***):
cylp_log2.txt (in the log you can also see all the steps I took starting from creating the new environment)
Sorry for the late response. I'm wondering if trying an older version would work. I'm not familiar with the new CyLP version, so I'm seeing what I can find out about it. In the meantime, maybe install coincbc with: conda install -c conda-forge coincbc and then cylp with: pip install git+https://github.com/jjhelmus/CyLP.git@py3. That has worked for our package cmac. I would still raise an issue on the CyLP issue tracker as well.
Thanks for the suggestion. To be safe I removed the old pyart_py38 env and created a new one with the same name. I tried to install using the above suggestions with the jjhelmus version, but I get compilation errors, no matter which compiler I use:
error: command '/gpfs/space/home/a93859/anaconda3/envs/pyart_py38/bin/x86_64-conda_cos6-linux-gnu-gcc' failed with exit status 1
Full error log here cylp_compile_error_log.txt
I now also raised an issue on the CyLP issue tracker https://github.com/coin-or/CyLP/issues/136
The fork https://github.com/jjhelmus/CyLP/tree/py3 of CyLP has been merged into master (see coin-or/CyLP#28) and other things have been fixed since then, so I doubt that rolling back to that version will help. Also, coincbc now just installs coin-or-cbc (see conda-forge/coin-or-cbc-feedstock#11), so that also shouldn't make a difference. If you can replicate this in stand-alone CyLP (or even better in stand-alone Cbc), I can take a look, but there's not enough information in coin-or/CyLP#136 to even start to debug.
Thanks @tkralphs for the response! Yeah, makes sense, I'll keep trying to see if I can reproduce the error. @tanelv Are you able to share one of the files that you're using?
Here is one of the files SUR190511130002.zip (IRIS raw)
@zssherman It would be great if we could join forces on this one. I'm interested in getting this working too.
@kmuehlbauer Awesome, yeah that sounds like a great idea to me! I haven't been able to reproduce the specific error yet, but the code is hanging on these files, so I've been digging through the code to see why. I have also tried not using the multiprocessing version of the code to try to isolate the problem.
@zssherman my idea is to start from the last working environment, if we could identify such. Then we could increase versions and see which one breaks. Ideally we would set this up using CI in a dedicated branch in our pyart forks. I'll try to get something running, but this might take some time.
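The "start from the last working environment and step forward" idea could be sketched roughly as below. Both function names are hypothetical, and `works` stands for whatever check you run per environment (for example: build the environment, run the failing script, return True if it finishes). The second helper only assembles a `conda create` command list and does not run it.

```python
def first_breaking_version(versions, works):
    """Walk `versions` in order and return (last_good, first_bad).

    `works` is a caller-supplied callable that takes one version string
    and returns True when the test script succeeds for that version.
    """
    last_good = None
    for version in versions:
        if works(version):
            last_good = version
        else:
            return last_good, version
    return last_good, None


def conda_create_command(python_version, packages, env_prefix="cylp_test"):
    """Build (but do not run) the conda command for one test environment."""
    env_name = f"{env_prefix}_py{python_version.replace('.', '')}"
    return ["conda", "create", "-y", "-n", env_name, "-c", "conda-forge",
            f"python={python_version}", *packages]


if __name__ == "__main__":
    for v in ["3.6", "3.7", "3.8", "3.9"]:
        print(" ".join(conda_create_command(v, ["numpy", "cython", "coin-or-cbc"])))
```

The same walk works for any ordered list (cython versions, coin-or-cbc builds), as long as only one thing changes between steps.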
@kmuehlbauer Gotcha, sounds good! So I did try the coincbc conda-forge install with the py3 branch in Python 3.6 just to try anything, and I was able to run the cylp code. Once I updated Python and cylp, the code started to hang on any file I tried, including the user's file above. The py3 branch of cylp only works for Python 3.6. Python 3.6 is far back, so not sure how useful that is, but between then and now is when something changed. Either the current KDP processing code doesn't handle the recent changes and needs to be updated, or something else is causing memory issues. I'm trying to check coin-or-cbc as well.
@kmuehlbauer The environment i used was:
conda create -n cylp_test -c conda-forge python=3.6 numpy netCDF4 coin-or-cbc scipy matplotlib cython gcc_linux-64 gxx_linux-64
with a development install of pyart and github install of the python 3 branch of cylp
Thanks @zssherman for the Python 3.6 reference. I also managed to get CyLP installed in Python 3.6 and my script now runs as it should. These are the steps I took (I removed the old environment before that)
conda create -n pyart_py36 -c conda-forge python=3.6 numpy netCDF4 scipy matplotlib cython gcc_linux-64 gxx_linux-64 arm_pyart coincbc gdal
conda activate pyart_py36
pip install git+https://github.com/jjhelmus/CyLP.git@py3
@zssherman Just FYI, I've recreated the Python 3.6 environment as suggested. It worked. I've created other environments for Python 3.7 /3.8 and 3.9. It looked promising first, but now nothing works, even the Python 3.6 environment doesn't work. I have to restart from scratch.
I've found those interesting issues over at CyLP, which might be connected:
Also I found that we have to be careful with the cython version and we would need to recreate the cpp in any case.
@kmuehlbauer Sorry for the late response, was on vacation. And makes sense, yeah that is helpful, thanks for finding those! Trying to think how to go about this next because it almost seems like a memory leak issue.
As a side note, we will be having assistance on this soon and will most likely do an overhaul of the kdp processing code.
So it looks like Google has an or-tools package that has the ability to access the same linear program solvers we use in cylp.
For example, check out this walkthrough of a mixed-integer programming problem... here is a list of the solvers available:
GLPK_MIXED_INTEGER_PROGRAMMING or GLPK or GLPK_MIP
This package is pip installable, and works with the most recent Python versions
@mgrover1 That's available from within conda-forge (ortools-python), too.
Awesome - yeah, it looks like they have a Simplex option, which is what is currently used...
There is no shortage of Python interfaces to MIP solvers. @mkoeppe recently compiled a nice list of all the options, which it would probably be useful to have somewhere other than a ticket in Sage, but here it is:
https://trac.sagemath.org/ticket/26511#comment:56
I believe or-tools uses file I/O to pass instances to a stand-alone Cbc solver and also uses pure Python to build the model, so it's going to be much slower than CyLP. I'm not sure if speed is important for you but if not, then or-tools would probably serve your purpose. python-mip calls the Cbc library directly through cffi so it's passing the instance to Cbc in memory, but would still be slower than CyLP because it also builds the model in Python. I realize that you guys have struggled a lot with CyLP and it makes sense to look at alternatives, but just wanted to make you aware of the tradeoffs.
By the way, I'm not sure if you guys saw it, but @mkoeppe and I just finished some major improvements to CyLP and there are now binary wheels for all platforms, dramatically simplifying installation (no need to install Cbc, see here).
Whether you continue with CyLP or not, I'm still interested in tracking down this bug.
@tkralphs thanks for your response - as someone who is new to MIP solvers, I appreciate your insight on the Python MIP interfaces and your work on CyLP.
The main reason for looking into alternatives was the requirement to use Python 3.6, which was causing issues with installing the rest of the environment we use for PyART.
That is fantastic news about the improved installation steps! I just tried it out with a Python 3.9 environment, and it worked beautifully. Happy to provide feedback where we can, and again, thanks for all your work with CyLP.
Just to be clear, CyLP works with any version of Python. I am using it with Python 3.10. I think the Python 3.6 "requirement" came from the fact that installing it in 3.6 seemed to overcome the particular bug reported here for some reason, but I think the situation is not at all clear at this point. Some more digging is needed. If someone could try to replicate this issue with the new wheels, that would be helpful. Perhaps that will fix the bug somehow.
Using the new build files, I am still seeing the following when running our example using CyLP
Processing Code:
import numpy as np
import matplotlib.pyplot as plt
import pyart
from pyart.testing import get_test_data
file = get_test_data('095636.mdv')
# perform LP phase processing (this takes a while)
radar = pyart.io.read_mdv(file)
# the next line force only the first sweep to be processed, this
# significantly speeds up the calculation but should be commented out
# in production so that the entire volume is processed
radar = radar.extract_sweeps([0])
phidp, kdp = pyart.correct.phase_proc_lp(radar, 0.0, debug=True)
Error:
Exec time: 0.5900969505310059
Doing 0
python(43345,0x117afc600) malloc: *** error for object 0x7f7fad9c6660: pointer being freed was not allocated
python(43345,0x117afc600) malloc: *** set a breakpoint in malloc_error_break to debug
@tkralphs we are running into the same issue described in https://github.com/coin-or/CyLP/issues/138 I believe...
OK, thanks, I will try to find some time to build a version of CyLP and Cbc with debugging symbols, so that I can see exactly where this error is occurring.
It looks like printing the array returned by CyLP works:
print(solution)
[[ 2.34766422 2.43061544 3.18696968 ... 34.95957546 34.96160985
34.96285309]
...
[ 0.66390759 0.88923667 1.24627207 ... 27.49033571 27.46688439
27.41206319]
[ 0.90856286 1.30960192 1.81392097 ... 32.86643274 32.86900836
32.87058235]
It is a numpy.ndarray:
<class 'numpy.ndarray'>
and we can take the mean of this array
37.23997104342368
but when we assign some variable to this solution in the function phase_proc_lp, we run into the malloc error:
python(63892,0x10f9fa600) malloc: *** error for object 0x7fe6d3f50660: pointer being freed was not allocated
python(63892,0x10f9fa600) malloc: *** set a breakpoint in malloc_error_break to debug
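To narrow down which Python line triggers the bad free, one option is the standard-library `faulthandler`, which prints the Python traceback when the process receives a fatal signal such as SIGABRT (an invalid free usually ends in an abort, though that is an assumption about this particular crash). The sketch below runs a script in a child process with `-X faulthandler` so the parent survives and can inspect stderr. The helper name is made up for illustration.

```python
import subprocess
import sys


def run_with_faulthandler(args, timeout=600):
    """Run `python -X faulthandler <args>` in a child process.

    Returns (returncode, stderr). On POSIX a negative return code means
    the child was killed by that signal number (e.g. -6 for SIGABRT).
    Raises subprocess.TimeoutExpired if the child hangs past `timeout`.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr
```

For example, `run_with_faulthandler(["my_phase_proc_script.py"])` (a placeholder file name) returns the exit status and whatever traceback `faulthandler` managed to write. The timeout also turns a silent hang into an explicit exception.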
@tkralphs following up - have you had a chance to look at the build error here?
Thanks @zssherman for the Python 3.6 reference. I also managed to get CyLP installed in Python 3.6 and my script now runs as it should. These are the steps I took (I removed the old environment before that)
conda create -n pyart_py36 -c conda-forge python=3.6 numpy netCDF4 scipy matplotlib cython gcc_linux-64 gxx_linux-64 arm_pyart coincbc gdal
conda activate pyart_py36
pip install git+https://github.com/jjhelmus/CyLP.git@py3
@tanelv I have the same problem as you, but after following your steps to install CYLP, it reports an error when running the test code:
Processing Code: https://github.com/coin-or/CyLP#modeling-example
Error:
undefined symbol:_ZN17CoinIndexedVectorD2Ev
In addition to using pip install git+https://github.com/jjhelmus/CyLP.git@py3 to install cylp-0.7.4, did you do any other configuration?
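For anyone reading the undefined-symbol message: the name is a mangled C++ symbol. The toy decoder below handles only the narrow pattern seen here (nested names followed by an optional constructor/destructor tag and an empty argument list), so treat it as an illustration rather than a real demangler; `c++filt` from binutils does this properly where it is available.

```python
import re

# Constructor/destructor tags in the Itanium C++ ABI mangling scheme.
_SPECIAL = {
    "C1": "{cls}", "C2": "{cls}",
    "D0": "~{cls}", "D1": "~{cls}", "D2": "~{cls}",
}


def demangle_simple(symbol):
    """Decode names of the form _ZN<len><name>...[C1|C2|D0|D1|D2]Ev.

    Anything outside this narrow subset raises ValueError.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError(f"unsupported symbol: {symbol!r}")
    pos, parts = 3, []
    while True:
        match = re.match(r"\d+", symbol[pos:])
        if not match:
            break
        length = int(match.group())
        start = pos + len(match.group())
        parts.append(symbol[start:start + length])
        pos = start + length
    if not parts:
        raise ValueError(f"unsupported symbol: {symbol!r}")
    tag = symbol[pos:pos + 2]
    if tag in _SPECIAL:
        parts.append(_SPECIAL[tag].format(cls=parts[-1]))
        pos += 2
    if symbol[pos:] != "Ev":
        raise ValueError(f"unsupported symbol: {symbol!r}")
    return "::".join(parts) + "()"


print(demangle_simple("_ZN17CoinIndexedVectorD2Ev"))
```

The symbol decodes to the destructor of `CoinIndexedVector`, a class from the COIN-OR CoinUtils library. A likely (but unconfirmed) reading is that the CyLP extension was compiled against a different CoinUtils/Cbc build than the one being loaded at runtime.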
@zssherman I encountered the same problem and rolled back to CYLP-0.7.4 as follows:
conda create -n pyart_py36 -c conda-forge python=3.6 numpy netCDF4 scipy matplotlib cython gcc_linux-64 gxx_linux-64 arm_pyart coincbc gdal
conda activate pyart_py36
pip install git+https://github.com/jjhelmus/CyLP.git@py3
but in CyLP-0.7.4, even the most basic imports report an error:
So if I want to run this function successfully now :
pyart.correct.phase_proc_lp(radar, 2.0, self_const = 12000.0, low_z=0.0, high_z=53.0, min_phidp=0.01, min_ncp=0.3, min_rhv=0.8, LP_solver='cylp_mp', proc=15)
How should I configure my CyLP and Pyart environments? Looking forward to your reply!
@mole-bai - we are working on replacing the CyLP solver in Py-ART. You can use one of the other solvers (LP_solver = "pyglpk" or LP_solver = "cvxopt").
We apologize that we are not able to support solving this CyLP issue.
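A small sketch of falling back to one of those solvers: the helper below returns the first solver name whose backing module can be imported. The helper name and the solver-to-module mapping are assumptions made for illustration (in particular, that the pyglpk package imports as `glpk`); check them against your installation.

```python
import importlib.util

# Py-ART LP_solver name -> module that must be importable (assumed mapping).
SOLVER_MODULES = {
    "pyglpk": "glpk",
    "cvxopt": "cvxopt",
}


def pick_lp_solver(preferred=("pyglpk", "cvxopt"), is_available=None):
    """Return the first solver name in `preferred` whose module is installed."""
    if is_available is None:
        def is_available(module):
            return importlib.util.find_spec(module) is not None
    for name in preferred:
        if is_available(SOLVER_MODULES[name]):
            return name
    raise RuntimeError("no supported LP solver is installed")


# Usage with Py-ART (not executed here; `radar` is a Py-ART Radar object):
#   phidp, kdp = pyart.correct.phase_proc_lp(radar, 0.0, LP_solver=pick_lp_solver())
```

The `is_available` parameter exists only so the selection logic can be exercised without the solvers installed.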
After finally managing to successfully install CyLP, using it in phase_proc_lp (pyart.correct.phase_proc_lp(radar, 2.0, self_const = 12000.0, low_z=0.0, high_z=53.0, min_phidp=0.01, min_ncp=0.3, min_rhv=0.8, LP_solver='cylp_mp', proc=15)) does not work. The error seems to be "Error in `python': free(): invalid pointer: 0x00005597c77d6c98"
A long list of messages and memory map is being printed out: cylp_messages.txt And then the script just hangs.
I installed CyLP following these instructions https://github.com/coin-or/CyLP
I tried also installing CyLP following these instructions provided in the Py-ART documentation https://arm-doe.github.io/pyart/setting_up_an_environment.html but unsuccessfully. I got what looked like compiling issues even after installing additional conda compilers. So the original CyLP installation instructions worked, but for some reason the phase_proc_lp function is not working still.