Closed zym1010 closed 8 years ago
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
@zym1010 please bump the build number in the meta.yaml file to get a new build.
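For reference, bumping the build number is a one-line change in the recipe's meta.yaml. A minimal sketch (the package version and numbers shown are placeholders, not the feedstock's actual values):

```yaml
# Hypothetical meta.yaml excerpt: incrementing `number` under `build`
# triggers a fresh build of the same upstream version.
package:
  name: openblas
  version: 0.2.19   # placeholder version

build:
  number: 1         # was 0; bump to force a new build
```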
@ocefpaf Done!
Also, if you are interested in this recipe, consider adding yourself in the maintainers list :wink:
So, we had initially disabled threading as there is a lot of bad behavior in OpenBLAS w.r.t. threading, and it has several fun bugs as a consequence.
See these two comments and related xref'd issue for details.
Thanks. But I think those issues are fixed, right? Also, without threading, it's basically useless...
I used to compile OpenBLAS myself under Ubuntu 14.04 with default settings and they worked without a problem.
Or, if really necessary, open a new fork of multithreaded OpenBLAS... I switched to conda-forge instead of Anaconda because sometimes they make crappy numpy and scipy packages for various reasons, yet their OpenBLAS is multithreaded. I don't want to lose the benefit of multithreaded OpenBLAS.
@jakirkham I checked those issues you mentioned. Can't they be avoided by setting an environment variable at runtime?
Thanks. But I think those issues are fixed, right?
As @njsmith says in his comment, they added a workaround. The problem is how they are scheduling threading and that has not changed. For instance, I'm looking at this issue ( https://github.com/xianyi/OpenBLAS/issues/900 ) and it looks like a variant on the exact same problem that he experienced.
@jakirkham I see. But can they be avoided by setting flags to restrict threading at runtime, rather than falling back to single-thread for everyone when building? Since I think most people won't encounter such issues...
Just to weigh in, given that the user can set the behaviour both via an environment variable or a call into OpenBLAS (in C) it seems sensible that we would build with threading and let the user avoid their own pathological cases?
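To make the runtime controls mentioned above concrete (my own sketch, not part of the thread): OpenBLAS reads the OPENBLAS_NUM_THREADS environment variable when the library initializes, so a user who does hit a pathological case can cap the thread count before the library is loaded.

```python
import os

# OpenBLAS reads OPENBLAS_NUM_THREADS at load time, so set it before the
# first import of numpy (which pulls in the shared OpenBLAS library).
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np

# Any BLAS-backed operation now runs single-threaded.
a = np.ones((100, 100))
b = a @ a  # matrix multiply dispatches to OpenBLAS gemm
print(b[0, 0])  # each entry is a sum of 100 ones -> 100.0
```

The same cap can be applied from C code via `openblas_set_num_threads()`, which is what "a call into OpenBLAS (in C)" refers to above.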
@njsmith, do you have any idea as to the frequency that sched_yield problems occur for the user? In other words, how easy is it to land in a pathological threading case?
Well, I did it once, and then they fixed that case. So no, not really. But presumably if it were happening to people every day (and in obvious ways), then they would be getting bug reports about it constantly, which doesn't seem to be the case.
I guess another data point is that the numpy wheels have been using openblas with threading enabled for a few months now, and we haven't gotten any complaints.
Thanks for the data points, and thanks for the proposal @zym1010. Given this seems to be working fine for a large number of users, I'm more than happy to enable it.
Appears the whole stack linking to OpenBLAS directly needs to be rebuilt. Unfortunately I missed this before merging. Sorry about that. Perhaps future BLAS changes should be enhancement proposals.
Trying to help move the stack forward to avoid breakage, though I could really use others' help. Here is what I have done so far. If you depend on NumPy, you should be fine to proceed. If you depend on SciPy, please wait for the deployment of its rebuild and you should be fine. Really, this is basically done at this point.
cc @conda-forge/core @conda-forge/numpy @conda-forge/scipy @conda-forge/scikit-learn @conda-forge/suitesparse @conda-forge/bob-math @conda-forge/hmat-oss @conda-forge/openturns @conda-forge/coincbc @conda-forge/caffe @conda-forge/cvxopt @conda-forge/simbody
Should add that if any builds are running into timeouts or getting segmentation faults, it may be that OpenBLAS is assuming too many threads on the CI. As we know 1 thread has worked in the past for these feedstocks (how they were built originally), please try setting OPENBLAS_NUM_THREADS to 1 before building or testing so as to avoid using too many threads.
A question came up in another thread about whether this breaks NumPy extensions that don't use a BLAS.
I don't think so. It seems only things linking to the BLAS need to be rebuilt. This ends up being a much shorter list. That being said, I'm not thrilled with this breakage either. As a test, I have tried a couple of numpy extensions (rank_filter, pyfftw) and they seem fine without a rebuild. Would be happy to test (or help others test) more things.
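For anyone wanting to run a similar check locally, a minimal smoke test along these lines (my own sketch, not the exact commands used above) exercises the BLAS through NumPy and compares it against a naive pure-Python reference:

```python
import numpy as np

# Exercise a BLAS-backed matrix multiply and check it against a slow
# pure-Python reference; a mislinked OpenBLAS typically fails at import
# time or produces wrong results here.
a = np.arange(9.0).reshape(3, 3)
b = np.arange(9.0, 18.0).reshape(3, 3)

fast = a @ b  # goes through the BLAS (dgemm)
slow = np.array([[sum(a[i, k] * b[k, j] for k in range(3))
                  for j in range(3)]
                 for i in range(3)])

assert np.allclose(fast, slow)
print("BLAS smoke test passed")
```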
cc @moorepants @ocefpaf
Since old packages that were built with older openblas do not work with this new openblas build, we need to hotfix all the old ones to make sure they don't get installed with this new openblas.
We have not decided as a community to go through with hotfixing ATM (see discussion in this PR: https://github.com/conda-forge/conda-forge.github.io/pull/170 ), so I'm not really sure we can do that until we agree on hotfixing as a strategy. The old packages shouldn't be installed by default, as the newer build number of OpenBLAS and the rebuilt packages is preferred. Thus, by default, the user gets a working environment once all PRs are merged.
However, I'm aware that you do like to use your old packages, @183amir. Pinning the build number would make this possible, and this would be doable if hotfixing were permissible; though it is not. I can see two other options.
Option 1 seems totally opposite to what you are trying to accomplish even though it "fixes" the problem. Option 2 could provide a legitimate fix that would work for you without resorting to hotfixing. If option 2 is permissible, we should discuss how that relabeling would happen and to what.
This is a bit off topic, but I could see value in stamping packages with labels based on date or some similar granularity. This way people wanting old packages could pull everything in from that date without changing the packages themselves. Maybe worth a bullet point in PR ( https://github.com/conda-forge/conda-forge.github.io/pull/170 )'s overview of strategies.
It seems the Travis CI OS X workers are saturated. Not seeing anything building on our end (though it is not 100% obvious from the web interface for me, so feel free to correct me). Maybe there is an issue they need to fix. This is probably going to hold up resolution of this issue.
🎉 SciPy has been rebuilt! 🎉
A question came up in another thread about whether this breaks NumPy extensions that don't use a BLAS.
I don't think so.
You cannot know that. Some extensions may depend on numpy and openblas but have only numpy specified as a dependency, while they try to auto-discover the "blas" that numpy brought with it, if any. (And note that this is fair as packaging goes, BTW!)
Luckily it seems that scikit-learn had openblas specified and you found it. We will only know if there are others if people report breakages.
Some extensions may depend on numpy and openblas but have only numpy specified as a dependency, while they try to auto-discover the "blas" that numpy brought with it, if any.
It is possible there are packages linking to the BLAS and not specifying it. However, this is no longer a BLAS specific problem. This is a general packaging problem on our end that needs addressing. Namely packages should list the dependencies they link to. Not doing so seems to have caused the issues we experienced with Qt. Though I agree that we should remain vigilant for similar issues.
Luckily it seems that scikit-learn had openblas specified and you found it. We will only know if there are others if people report breakages.
There was no luck involved. I investigated this when packaging scikit-learn and made sure that it was packaged appropriately.
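A hypothetical sketch of what "packaged appropriately" looks like in a recipe: the linked BLAS is listed explicitly in the requirements rather than left to come in transitively through numpy (this fragment is illustrative, not the feedstock's actual file):

```yaml
# Illustrative meta.yaml fragment: openblas is declared in both build and
# run requirements because the extension links against it directly.
requirements:
  build:
    - numpy
    - openblas
  run:
    - numpy
    - openblas
```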
So basically we are down to openturns (of the things we know of). The PR for it works; it is simply a matter of merging it and getting it built at this point. Everything else has been rebuilt.
As mentioned by @ocefpaf, there could be packages that we did not find depending on OpenBLAS that do. Keep your eyes open for loading issues with OpenBLAS. It may indicate such a package is involved. Please let us know so we can help you fix it.
Thanks everyone for your help.
openturns was rebuilt earlier today. Considering this resolved.
fix threading.