Closed: JacksonBurns closed this pull request 4 months ago.
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 54.86%. Comparing base (eed950a) to head (87d0ae5). Report is 59 commits behind head on main.
Seems that conda does not explicitly support per-dependency channel specification, but that it actually works anyway: https://github.com/conda/conda-build/issues/532#issuecomment-1777119377
This will be very helpful for this recipe if it turns out to be true.
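For concreteness, this is what the channel::package MatchSpec form looks like on the CLI (the package names here are placeholders, not the actual recipe dependencies); the open question from the linked issue is whether the same form is honored inside a recipe's requirements section:
# MatchSpec syntax: the channel::package form is accepted by the conda CLI.
# The package names below are placeholders for illustration only.
conda install "conda-forge::numpy" "rmg::rmgmolecule"
# What the linked issue discusses is whether this same channel::package
# form is also respected when written in a recipe's meta.yaml requirements.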
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
@mjohnson541 the build tries to install @hwpang's copy of rmgmolecule during the Julia installation procedure. It also fails to import RMG, so it is 'seeing' the wrong Python (?): https://github.com/ReactionMechanismGenerator/RMG-Py/actions/runs/8309332599/job/22740568265?pr=2636#step:6:4931
The code I sent was an example of how to have variables load on activation of the environment, not code to copy-paste that would set the variables for this particular case. So what's here won't do the job.
I believe this is roughly the scripting you need:
if [ -f "$ACTIVATE_ENV" ]; then
    echo 'python-jl -c "import julia; julia.install()"' >> $ACTIVATE_ENV
    echo "sed -i '/julia.install/d' $ACTIVATE_ENV" >> $ACTIVATE_ENV
else
    mkdir -p ${PREFIX}/etc/conda/activate.d
    touch ${PREFIX}/etc/conda/activate.d/env_vars.sh
    echo '#!/bin/sh' >> $ACTIVATE_ENV
    echo 'python-jl -c "import julia; julia.install()"' >> $ACTIVATE_ENV
    echo "sed -i '/julia.install/d' $ACTIVATE_ENV" >> $ACTIVATE_ENV
fi
No guarantees I've done the escaping perfectly. I'm also a little unsure whether the sed command will properly delete the line during the first execution; that might require a bit of experimentation. The build should pass without the sed command added, but then it would run julia.install() every time the environment activates, which isn't remotely as slow as precompilation is the second time, but is still something I'd like to avoid. There are probably better ways to make the julia.install() command run only on the first activation, but this came to mind first.
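For what it's worth, another way to get the run-once behaviour would be a sentinel file instead of the self-deleting sed line; this is only a sketch of that idea (the marker filename is made up) and I haven't tested it against this recipe:
# Sketch of a sentinel-file guard: run julia.install() only the first
# time the environment is activated, instead of editing the activation
# script in place. ".julia_installed" is a made-up marker name.
ACTIVATE_ENV="${PREFIX}/etc/conda/activate.d/env_vars.sh"
mkdir -p "${PREFIX}/etc/conda/activate.d"
cat >> "$ACTIVATE_ENV" <<'EOF'
if [ ! -f "$CONDA_PREFIX/.julia_installed" ]; then
    python-jl -c "import julia; julia.install()" && touch "$CONDA_PREFIX/.julia_installed"
fi
EOF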
The problem this fixes is that when you move the conda binary out of where it was built, Julia no longer finds its Python in the same place and uses the default, which is its own Python and not the environment's python executable (which makes sense from PyCall's perspective even if it doesn't make sense here). PyJulia's default is to find the julia executable, which it does; it then assumes everything is okay, asks for the RMS variables, starts loading RMS, asks its default Python for the rmgpy/molecule modules, and when it doesn't find them, tries to install them.
This should make conda's Python ask PyCall to associate itself with that same conda Python inside the installed binary whenever the environment loads.
@mjohnson541 thanks for the note. I've added it to the build script without the sed command for the time being so that we can try it out 'plain'.
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
Sorry, forgot to define ACTIVATE_ENV.
ACTIVATE_ENV="${PREFIX}/etc/conda/activate.d/env_vars.sh"
if [ -f "$ACTIVATE_ENV" ]; then
    echo 'python-jl -c "import julia; julia.install()"' >> $ACTIVATE_ENV
    echo "sed -i '/julia.install/d' $ACTIVATE_ENV" >> $ACTIVATE_ENV
else
    mkdir -p ${PREFIX}/etc/conda/activate.d
    touch ${PREFIX}/etc/conda/activate.d/env_vars.sh
    echo '#!/bin/sh' >> $ACTIVATE_ENV
    echo 'python-jl -c "import julia; julia.install()"' >> $ACTIVATE_ENV
    echo "sed -i '/julia.install/d' $ACTIVATE_ENV" >> $ACTIVATE_ENV
fi
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
The macOS build seems to be failing with clang and I can't figure out how to make gcc work - will deal with this later if we can at least get it to work on ubuntu first.
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
@mjohnson541 the build is still trying to download rmgmolecule, but then interestingly still cannot find rmg modules even after that, see build log: https://github.com/ReactionMechanismGenerator/RMG-Py/actions/runs/8329542581/job/22792103206?pr=2636#step:6:4925
Any ideas?
The rush of commits this morning was just figuring out some unrelated conda errors - I had to avoid the latest version of conda, which is apparently bugged.
Okay, so I thought the issue was that you were failing during the binary tests, not the build. The activate scripts certainly won't do anything during the build. In theory the build should be simpler to deal with; this implies that rmgpy can't be imported during the build, at least not by the Python linked to Julia when it executes. So first we check that rmgpy is available to the Python we're trying to use: python -c "from rmgpy.molecule import Molecule". We should also check whether Julia is linked to the correct Python: julia -e "using PyCall; println(PyCall.PYTHONHOME)", and, just to be sure where the current Python is: which python.
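Collected in one place, the checks would look roughly like this as a diagnostic block dropped into build.sh right after compilation (a sketch, not a required change):
# Diagnostic sketch: which python is on PATH, can it import rmgpy,
# and which Python is PyCall/Julia actually linked against.
which python
python -c "from rmgpy.molecule import Molecule"
julia -e 'using PyCall; println(PyCall.PYTHONHOME)'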
I feel like these things would be really quick to debug from a local broken build. I made a couple of attempts to build this this weekend, but all of them failed during cython compilation.
At which point in the build script would those be helpful (where should they be added)? It also seems like conda-build isn't outputting a 'broken' build anywhere, based on what I can see in the logfile. I think it only does that if build.sh executes but the test step of the overall package build fails.
I have been trying to get the build to run on macOS runners as well, but it seems that clang isn't behaving. I will try again to force macOS to use gcc, which seems to work on ubuntu. Are you using clang?
Oh... that might be the case. Regardless, it's hard to recover here. I was trying to build on ubuntu, although I did briefly try osx and decided to do ubuntu first.
If we move this to a ReactionMechanismGenerator/RMG-Py branch rather than your own I can just push commits without needing to ask you to make them. That might make this easier.
I have "Allow edits and access to secrets by maintainers" checked so you should be able to push to this branch - if it's not working, I can make an RMG-Py copy of this, just trying to avoid the duplicate push/PR workflow runs.
Taking the compiler specification out of the recipe actually seems to have fixed the previous compiler problems. Once the runs are completed I can check properly, but it looks like we are now correctly using the system compilers rather than any conda ones.
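To confirm that, a small diagnostic like the following could be echoed from build.sh (just a sketch; conda's compiler packages normally export CC/CXX, so unset values would point at the system toolchain):
# Diagnostic sketch: report the compilers actually in use.
echo "CC=${CC:-unset} CXX=${CXX:-unset}"
(which gcc && gcc --version | head -n 1) || true
(which clang && clang --version | head -n 1) || true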
I have "Allow edits and access to secrets by maintainers" checked so you should be able to push to this branch - if it's not working, I can make an RMG-Py copy of this, just trying to avoid the duplicate push/PR workflow runs.
Oh cool! Didn't know you could do that with "Allow edits by maintainers".
Looks like it isn't a julia-python connection issue or a path issue, rmgpy isn't compiling properly.
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
Are we asking the wrong Python to try the given import statement?
+ export PYTHONPATH=/usr/share/miniconda/conda-bld/rmg_1710793746894/work:
+ PYTHONPATH=/usr/share/miniconda/conda-bld/rmg_1710793746894/work:
+ echo 'testing rmgpy'
+ python -c 'from rmgpy.molecule import Molecule'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/share/miniconda/conda-bld/rmg_1710793746894/work/rmgpy/molecule/__init__.py", line 31, in <module>
from rmgpy.molecule.element import Element, PeriodicSystem, get_element
File "/usr/share/miniconda/conda-bld/rmg_1710793746894/work/rmgpy/molecule/element.py", line 47, in <module>
from rmgpy.quantity import Quantity
File "/usr/share/miniconda/conda-bld/rmg_1710793746894/work/rmgpy/quantity.py", line 43, in <module>
from rmgpy.rmgobject import RMGObject, expand_to_dict
ModuleNotFoundError: No module named 'rmgpy.rmgobject'
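Since rmgpy.rmgobject looks like one of the Cython-built extensions (my assumption), a quick follow-up check would be whether its compiled artifact was produced at all, e.g.:
# Sketch: look for the compiled rmgobject extension in the work tree.
# If nothing turns up, cythonization/compilation silently skipped it.
find . -name "rmgobject*.so" -o -name "rmgobject*.pyd"
ls rmgpy/rmgobject* || true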
The only other thing I can think of is that running make install (which builds the main code and the solvers at the same time) instead of make (which runs all, which builds the main code and then the solvers) is causing the problem, especially since I don't think anyone does the plain make call these days. We may have introduced some issue somewhere that makes this rule no longer work.
I will edit the build script to try, but I'm not sure it will work.
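In build.sh terms, the change I'm describing is roughly this (a sketch of the swap, not the exact diff):
# Before: make install (main code and solvers together).
# After (sketch): use the default target, which runs `all`
# (main code first, then solvers), then immediately re-check the import.
make
python -c "from rmgpy.molecule import Molecule"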
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
With conda-build stuff it's worth checking everything; however, to the question of whether we're asking the wrong Python: 1) I'm not sure what other python that would be, and 2) both the compile and the diagnostic calls use the default python, so that would also suggest the definition of python would have had to change during the compile.
Maybe - though on the make install vs. make theory, I'm afraid my suspicion is that this is down to the compiler setup and cython. Your cython version doesn't immediately strike me as wrong. I'm not particularly literate with compiler setups. I am wondering a little bit whether the # [unix] you deleted actually does something in the build through jinja or something.
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
I think that particular one might effectively not be doing anything, but it is a thing apparently: https://docs.conda.io/projects/conda-build/en/stable/resources/define-metadata.html#preprocessing-selectors.
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
@mjohnson541 I have consolidated our efforts on this branch into 37bf500, including my efforts from #2641 that fix the plain Python recipe. Going forward on this PR, we should be more readily able to debug which errors are arising from the build step with Julia vs. Python.
Unfortunately I don't think this approach will allow us to build the Mac conda binaries using GitHub actions, since they are apparently still not up to the task of precompiling RMS (see this job failure on the aforementioned PR, where the Python step works but then putting RMS on top of it hangs).
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
Wow, that is a terrible hang! I've never seen Pkg hang like that before. I just relaunched it to check if it was some machine or communications issue.
If that issue is a function of the environment we're building in Julia, one of the quieter benefits of switching to juliacall is that we no longer need diffeqpy, and thus DifferentialEquations.jl, which is an enormous dependency of the RMG-RMS julia environment (and isn't part of the RMS julia environment itself), so Pkg should build the environment appreciably faster once that's merged.
Sounds good! If you can get RMG to compile here I'll handle any of the julia related problems.
Yeah, that hang is nuts - the CI actually skips tests on Mac and just builds RMG because running build rms in Julia always hangs (see https://github.com/ReactionMechanismGenerator/RMG-Py/pull/2500). If juliacall fixes it that would be great, especially since that would also allow us to remove diffeqpy, which we have an unofficial image of anyway (that we want to stop using).
The latest ubuntu run does build rmg without problems, and then the 'testing rmg' part of the build script fails: https://github.com/ReactionMechanismGenerator/RMG-Py/actions/runs/8500378693/job/23282167579?pr=2636#step:6:2052
The rerun completed successfully so looks like it just got a bad machine or some server it needed was down when it ran.
I mean if we can't import Molecule then something is clearly wrong with compilation even if it doesn't error during the compilation step.
The actions runs on this PR aren't getting to the RMS build step even with the modification to the environment activation script.
The molecule import works on the pure python PR after installing RMS on top of the conda binary, so our build.sh in this PR (here: https://github.com/ReactionMechanismGenerator/RMG-Py/blob/685168fa3dea9ecfd43f7dd968a8f96e6123496a/.conda/build.sh) must be causing the problem. We do change the PYTHON environment variable and some other things - could that be causing this?
Yeah, I would copy whatever you had that worked as exactly as possible here (ignoring/removing anything we added here), plus the debug line trying to import Molecule after compiling. Once we have a setup where RMG compiles properly, I can add anything we need back in.
I've moved the make install call all the way to the top of build.sh - in theory, the only difference between this and the pure-python version is that the conda environment during the build will have Julia, pyrms, etc.
We should keep an eye out for the warning that the Julia dependencies cannot be imported, which should be raised by this line: https://github.com/JacksonBurns/RMG-Py/blob/87d0ae50e94e18639032d765b7f15fe97f94dc08/rmgpy/rmg/reactors.py#L52
⚠️ One or more regression tests failed. Please download the failed results and run the tests locally or check the log to see why.
beep boop this comment was written by a bot :robot:
This pull request is being automatically marked as stale because it has not received any interaction in the last 90 days. Please leave a comment if this is still a relevant pull request, otherwise it will automatically be closed in 30 days.
This PR is a combination of effort from https://github.com/ReactionMechanismGenerator/RMG-Py/pull/2539 to fix the conda recipe and advice received on https://github.com/ReactionMechanismGenerator/RMG-Py/pull/2631 in this comment: https://github.com/ReactionMechanismGenerator/RMG-Py/pull/2631#issuecomment-1998723914, which may allow fixing the conda recipe without making RMS an optional installation dependency (the solution proposed in #2631).
Prior efforts to do this were unsuccessful for reasons discussed on #2631, in addition to the limited capacity of the GitHub runners - when the actions runners try to build the recipe they run out of memory. Fortunately, in the last year GitHub has upgraded the linux runners, so they might be able to do it now. Failing all else, we will at least have one publicly visible attempt at fixing the recipe so that everyone can run it locally.