Closed mlubin closed 10 years ago
it's only in a source compile, but google agrees that your processor is ivy bridge, so I don't think it is a haswell bug
According to a vague comment in the libuv source, I have discovered there are such things as "non-IFS LSPs", which seem to be defined as "LSPs which are not IFS" in most of my google results, and which could cause this failure.
Anyways, libuv aborts at startup if any of these are detected in the network stack, due to a firewall, local packet sniffer, virus / malware, or corruption (since they can cause lost or delayed data, in Vista onwards). The following article seems to describe how to detect these intruders: http://support.microsoft.com/kb/811259
Can anyone with the problem confirm that this is / is not a problem on their machine?
edit: link fixed
Dead link
On Thu, Jun 12, 2014 at 11:02 PM, Jameson Nash notifications@github.com wrote:
"LSPs which are not IFS"
I don't know exactly what I'm doing here, but I did find this: http://support.microsoft.com/kb/2568167 and in it there is a command line instruction "netsh Winsock Show Catalog". Executing it in a command window reveals "Winsock Catalog Provider Entry" entries, each of which contains a field labeled "Service Flags:". According to the referenced document, if the service flags contain 0x20000, the LSP is IFS. If that bit is cleared, it is non-IFS. All entries I received when I ran the command had service flags which contained 0x20000. I'm assuming this means the LSPs in my network stack play by the rules.
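The flag check described above can be automated. The following is a minimal sketch, not an official tool: it assumes the catalog text uses "Description:" and "Service Flags:" field labels with hexadecimal flag values, and the function name, the sample text, and the provider names in it are hypothetical.

```python
# Bit that marks a provider as IFS, per the KB article cited above.
IFS_BIT = 0x20000

# Hypothetical catalog excerpt; real output has more fields per entry.
SAMPLE_CATALOG = """\
Winsock Catalog Provider Entry
Description:                        Example Base Provider [TCP/IP]
Service Flags:                      0x20066

Winsock Catalog Provider Entry
Description:                        Example Layered Provider
Service Flags:                      0x66
"""


def find_non_ifs_providers(catalog_text):
    """Return (description, flags) for each entry whose IFS bit is cleared."""
    non_ifs = []
    description = None
    for line in catalog_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Field: value" line
        key, value = key.strip(), value.strip()
        if key == "Description":
            description = value
        elif key == "Service Flags":
            flags = int(value, 16)  # accepts the "0x" prefix
            if not flags & IFS_BIT:
                non_ifs.append((description, flags))
    return non_ifs


if __name__ == "__main__":
    for name, flags in find_non_ifs_providers(SAMPLE_CATALOG):
        print(f"non-IFS provider: {name} (flags {flags:#x})")
```

To use it on a real machine, save the output of "netsh Winsock Show Catalog" to a text file and pass its contents to the function. An empty result would match the "all entries contained 0x20000" observation above.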
David L. Livingston, Ph.D., P.E.
Design Engineer/Consultant Livingston Embedded Computing, LLC d.livingston@ieee.org, 540-520-1848
Professor of Electrical and Computer Engineering Virginia Military Institute livingstondl@vmi.edu, 540-464-7545
"100% of the shots not taken don't go in." The Great Gretzsky, ice hockey player "Without deviation from the norm, progress is not possible." Frank Zappa, musician, composer and social satirist "Complexity breeds fragility. Fragility breeds surprises. Surprises are bad." Bob Colwell, computer engineer and Pentium architect
Or this ancient, unsolved version, which sounds so nearly identical: http://www.itlisting.org/5-windows/228b667d9634aa62.aspx
I updated to Windows 8.1 last weekend. Using the same installer as before, I can no longer reproduce the issue.
Besides upgrading, I had to create a user profile from scratch. I ALSO had to copy over C:\Users\Default from a Windows 7 machine, because the upgrade had corrupted that profile, making creation of functional new users impossible.
The issue also doesn't exist in a secondary profile I had migrated through the upgrade, but I cannot tell if the problem would have been reproducible there before.
Can anyone try if the problem persists after
I just realized that the environments shown above have a PATH variable near the Windows maximum length, which is somewhere between 1024 and 32768 (likely numbers also include 1920, 2047, 8191).
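For anyone who wants to check this on an affected machine, here is a minimal sketch that reports the PATH length against the candidate limits listed above. None of these numbers is confirmed as the real limit, and the function name is hypothetical.

```python
import os

# Candidate limits quoted in the comment above; none is confirmed.
CANDIDATE_LIMITS = (1024, 1920, 2047, 8191, 32768)


def path_length_report(path_value, limits=CANDIDATE_LIMITS):
    """Return the length of path_value and the candidate limits it reaches."""
    length = len(path_value)
    reached = [limit for limit in limits if length >= limit]
    return length, reached


if __name__ == "__main__":
    length, reached = path_length_report(os.environ.get("PATH", ""))
    print(f"PATH is {length} characters long; limits reached: {reached}")
```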
Is there any hope of dealing with this soon?
actually, maybe.
the Windows documentation mentions that you shouldn't call getenv from DllMain. allocating memory and spawning threads are also advised against. mostly the side-effect will be occasional deadlock, but it also mentions that some of the Windows APIs will attempt to access uninitialized memory (specifically Advapi32, for interacting with the registry, among other things). libopenblas does many of these things, and removing DllMain also fixes this bug.
edit: ref http://msdn.microsoft.com/en-us/library/windows/desktop/dn633971(v=vs.85).aspx
using the following gist, kakobrekla was able to prove that this bug is in openblas: https://gist.github.com/vtjnash/bfbdfe55915557f0d691
edit: link to results http://dpaste.com/26MM927.txt
What a nightmare. You and kakobrekla deserve a medal. I'm still curious what changed in Win8 to trigger this.
(I'm running builds with clean-openblas - should be sufficient to apply the patch, I think. So next binaries should include this)
Wow, amazing. One of the most inscrutable and heavily-discussed bugs of all.
Something's not quite right yet with the fix, just tried the latest binaries posted half an hour ago:
Warning: error initializing module LinAlg:
ErrorException("ccall: could not find function gotoblas_init in library libopenblas")
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" to list help topics
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.0-prerelease+3836 (2014-06-22 03:25 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 0d98451 (0 days old master)
|__/                   |  x86_64-w64-mingw32
Should have been distclean-openblas or rm -r openblas-v0.2.9 to apply the patch. @ihnorton
This bugfix is totally crazy! Who would have thought that BLAS could screw up DNS?
@vtjnash yeah seems okay on a local build, no warning and I do see gotoblas_init exported from libopenblas.dll. If you're going to clean-openblas, may as well distclean-openblas too. Download time is a lot quicker than rebuild time for openblas.
New binaries are up, rebuilt after removing the openblas directory. Message is gone.
Thanks, but you forgot the LLVM patch!
I distcleaned llvm a couple days ago for that :/ Doing a from-scratch build now.
Have these changes made it into the nightly builds? I downloaded the src and did a clean build and I'm still seeing the issue.
Just downloaded and installed the 64-bit prerelease. Ran Pkg.init() (after deleting old history files) and no change in behavior. Still getting the "...could not resolve..." errors.
:(
Find me on IRC in the evening if you can help run more tests
We need a "craziest bugs" discussion tomorrow!
Julia 0.3 is now fully operational on my corporate issued Windows 8 laptop. With gratitude to all who helped solve this, thank you.
Is this fix in the nightlies? I'll ask the original user who had this issue to try it out.
The one from yesterday is; the latest commit Jameson just pushed is not, but will be in an hour or so.
@kakobrekla @wheineman @drlivip @flashus please confirm this bug is also fixed for you (and not broken again by my latest commit -- it looks like ihnorton has updated the binaries on http://status.julialang.org/ as promised)
I've been following this and can confirm the bug is now fixed for me! Great work!
@vtjnash confirmed. Please include it in 0.2.1.
Still working, still happy....
julia> versioninfo()
Julia Version 0.3.0-prerelease+3911
Commit 19582f7* (2014-06-27 16:05 UTC)
Platform Info:
  System: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-4600M CPU @ 2.90GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm
Yes, the bugs have been squashed. Thank you very much for your efforts.
Dave
Just for some closure, the user who originally had this issue reports that it's resolved. Thanks to everyone who helped solve this!
I'm helping out a user who's experiencing a very strange issue (Windows 8, 64-bit):
But pinging github's IP address from within julia works.
The strange part is that if you run the same command in the Git bash that comes with Julia, it works fine:
This is an issue with all DNS lookups, not just github. There are no firewalls enabled (that I can find). The same occurs with julia 0.2 and 0.3. This happens on all networks, not just MIT wifi. The internet connection seems to work fine from all applications except julia. How can I debug this?