braddr / d-tester

Automated testing for github projects.
http://d.puremagic.com/test-results/

SSL protocol issues when fetching the PR #70

Closed wilzbach closed 6 years ago

wilzbach commented 6 years ago
fetching https://github.com/wilzbach/phobos.git enforce-2 (attempt: 1/3)
error: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version while accessing https://github.com/wilzbach/phobos.git/info/refs?service=git-upload-pack
fatal: HTTP request failed
fetching https://github.com/wilzbach/phobos.git enforce-2 (attempt: 2/3)
error: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version while accessing https://github.com/wilzbach/phobos.git/info/refs?service=git-upload-pack
fatal: HTTP request failed
fetching https://github.com/wilzbach/phobos.git enforce-2 (attempt: 3/3)
error: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version while accessing https://github.com/wilzbach/phobos.git/info/refs?service=git-upload-pack
fatal: HTTP request failed

https://auto-tester.puremagic.com/show-run.ghtml?projectid=1&runid=3025521&isPull=true

braddr commented 6 years ago

Yes, github had a problem earlier today. This showed up across the fleet. There are already retries, as you can see in the log snippet, but at some point it just has to give up and declare failure. github has already gone back to working and the tests are passing currently.
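The "attempt: 1/3" lines in the log reflect a bounded retry loop. A minimal sketch of that pattern follows; the function names, the delay parameter, and the `git_fetch` helper are hypothetical and are not taken from the auto-tester's own code.

```python
import subprocess
import time


def fetch_with_retries(run_fetch, attempts=3, delay_seconds=0.0):
    """Call run_fetch() up to `attempts` times; return True on the first success."""
    for attempt in range(1, attempts + 1):
        print(f"fetching (attempt: {attempt}/{attempts})")
        if run_fetch():
            return True
        if attempt < attempts:
            # Give a transient outage a moment to clear before trying again.
            time.sleep(delay_seconds)
    # Every attempt failed: give up and let the caller declare failure.
    return False


def git_fetch(url, branch):
    """Hypothetical helper: True if `git fetch <url> <branch>` exits with 0."""
    return subprocess.run(["git", "fetch", url, branch]).returncode == 0


# Example use (not executed here because it needs network access):
# ok = fetch_with_retries(lambda: git_fetch("https://github.com/wilzbach/phobos.git", "enforce-2"))
```

Retries like this only help with transient failures; a protocol mismatch fails identically on every attempt, which is what the second log in this thread shows.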

wilzbach commented 6 years ago

This is happening again:

fetching https://github.com/kinke/dmd.git cpp2079 (attempt: 1/3)
fatal: unable to access 'https://github.com/kinke/dmd.git/': error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
fetching https://github.com/kinke/dmd.git cpp2079 (attempt: 2/3)
fatal: unable to access 'https://github.com/kinke/dmd.git/': error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
fetching https://github.com/kinke/dmd.git cpp2079 (attempt: 3/3)
fatal: unable to access 'https://github.com/kinke/dmd.git/': error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

https://github.com/dlang/dmd/pull/7940

And the status page from GitHub shows

All systems reporting at 100%

https://status.github.com/messages

kinke commented 6 years ago

[I had troubles too when pushing to my GitHub repo earlier.]

braddr commented 6 years ago

Looks like github has disabled older tls protocols, requiring tlsv1.2. Unfortunately freebsd 8.x (and 9.x) don't have a new enough openssl to support newer tls protocols. So, we're roughly in a position where github is forcing our hands in terms of upgrading to newer versions of freebsd. Even more unfortunately, no one has ever finished making dmd and related packages pass the builds and tests for freebsd 10 and 11. So, dlang is in a bad spot.

https://githubengineering.com/crypto-removal-notice/

I suggest dropping freebsd 8.x support from the master and stable branch testers. If/when someone gets the freebsd 10/11 builds and tests passing, they can re-enter the master and stable branches.

Additionally, if someone is really gung-ho, they could figure out how to update freebsd 8.x to a modern version of openssl without breaking the rest of the os.

Lastly, I don't see an escape hatch on github to allow continued use of older protocols, but maybe they left one and haven't advertised it.
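A quick way to see whether a given OpenSSL build is new enough is to compare its version against 1.0.1, the release that introduced TLS 1.2. The sketch below uses Python's `ssl` module purely as an illustration; it reports the OpenSSL that Python is linked against, which is an assumption and may differ from the library git and libcurl use on a tester. The helper name `supports_tls12` is hypothetical.

```python
import ssl


def supports_tls12(version_info):
    """True if an OpenSSL (major, minor, fix, ...) version tuple is at least 1.0.1."""
    return tuple(version_info[:3]) >= (1, 0, 1)


def describe_local_openssl():
    """Return a one-line summary of the OpenSSL this Python interpreter uses."""
    capable = supports_tls12(ssl.OPENSSL_VERSION_INFO)
    return f"{ssl.OPENSSL_VERSION}: TLS 1.2 capable = {capable}"


# Example use:
# print(describe_local_openssl())
```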

wilzbach commented 6 years ago

> I suggest dropping freebsd 8.x support from the master and stable branch testers. If/when someone gets the freebsd 10/11 builds and tests passing, they can re-enter the master and stable branches.

I presume that's the best way to go.

@jmdavis or @dkgroot might know a bit more about the current status of DMD on a modern BSD* OS?

braddr commented 6 years ago

@MartinNowak @WalterBright @andralex

Heads up guys, this is unfortunate but I'm not shocked that it's come to a forcing function.

braddr commented 6 years ago

I think I've got freebsd disabled for master and stable but still enabled for the "DMD upd fbsd" project.

WalterBright commented 6 years ago

We just don't have the resources to support older FreeBSD versions, and them not being compatible with github is just the nail in the coffin.

I propose moving to the oldest FreeBSD that works with github and abandoning the older ones.

braddr commented 6 years ago

We're at the point where there is NO version of freebsd that is supported that DMD works on out of the box. So, today, I yanked it from all the build configurations for the auto-tester except the "DMD upd fbsd". That config is doing test builds with freebsd 10 and 11, but they don't pass.

Someone is going to have to finally dig into and fix the problems. Until then, we no longer support freebsd as a tested platform.

jmdavis commented 6 years ago

Right now, it just looks like only 32-bit is failing. If that's true, can we at least have a 64-bit autotester setup running FreeBSD 11.1 so that PRs which break it don't get merged? Unfortunately, 32-bit is then still screwed for the moment, but that seems better to me than dropping all FreeBSD auto-testing entirely.

Given that I use FreeBSD as my primary platform, I've definitely tried to make sure that the latest version of FreeBSD works with the druntime and Phobos tests (either by fixing problems or pushing others to), but I'm not a compiler dev, so I pretty much never run the dmd test suite, and I don't even notice when that breaks. I also don't run any 32-bit systems, so I don't catch any problems there. Clearly, we as a group need to do a better job making sure that the autotester is able to run the latest FreeBSD and is doing so. Otherwise, we're going to be consistently screwed on this.

In any case, clearly, one or more of the compiler devs is going to need to fix dmd so that its test suite passes so that we can properly support 32-bit FreeBSD. But based on the upd fbsd section on the autotester, it looks like we can at least have a 64-bit autotester setup.

WalterBright commented 6 years ago

Thank you, @braddr. Looking at the failure to build on FreeBSD 32:

/usr/local/bin/ld: skipping incompatible //usr/lib/libpthread.so when searching for -lpthread
/usr/local/bin/ld: skipping incompatible //usr/lib/libpthread.a when searching for -lpthread
/usr/local/bin/ld: cannot find -lpthread
/usr/local/bin/ld: skipping incompatible //usr/lib/libm.so when searching for -lm
/usr/local/bin/ld: skipping incompatible //usr/lib/libm.a when searching for -lm
/usr/local/bin/ld: cannot find -lm
/usr/local/bin/ld: skipping incompatible /usr/local/lib/gcc6/gcc/x86_64-portbld-freebsd10.3/6.4.0/libgcc.a when searching for -lgcc
/usr/local/bin/ld: skipping incompatible //usr/lib/libgcc.a when searching for -lgcc
/usr/local/bin/ld: cannot find -lgcc
/usr/local/bin/ld: skipping incompatible /usr/local/lib/gcc6/gcc/x86_64-portbld-freebsd10.3/6.4.0/../../../libgcc_s.so when searching for -lgcc_s
/usr/local/bin/ld: skipping incompatible //usr/lib/libgcc_s.so when searching for -lgcc_s
/usr/local/bin/ld: cannot find -lgcc_s
/usr/local/bin/ld: skipping incompatible /lib/libc.so.7 when searching for /lib/libc.so.7
/usr/local/bin/ld: cannot find /lib/libc.so.7
/usr/local/bin/ld: skipping incompatible /usr/lib/libc_nonshared.a when searching for /usr/lib/libc_nonshared.a
/usr/local/bin/ld: cannot find /usr/lib/libc_nonshared.a
/usr/local/bin/ld: skipping incompatible /usr/lib/libssp_nonshared.a when searching for /usr/lib/libssp_nonshared.a
/usr/local/bin/ld: cannot find /usr/lib/libssp_nonshared.a

It looks like the correct system libraries are not installed on the autotester machine.
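GNU ld prints "skipping incompatible" when it finds a file with the right name whose architecture does not match the link target, so the log above can be reduced to a list of unresolved libraries and the candidates that were rejected for each. A small sketch of that reduction follows; the function name is hypothetical and the regular expressions only cover the two message shapes seen in this log.

```python
import re
from collections import defaultdict

# Message shapes taken from the log above.
SKIPPED = re.compile(r"skipping incompatible (\S+) when searching for (\S+)")
MISSING = re.compile(r"cannot find (\S+)")


def summarize_ld_log(log_text):
    """Map each unresolved library to the candidate files ld rejected for it."""
    skipped = defaultdict(list)
    missing = []
    for line in log_text.splitlines():
        match = SKIPPED.search(line)
        if match:
            skipped[match.group(2)].append(match.group(1))
            continue
        match = MISSING.search(line)
        if match:
            missing.append(match.group(1))
    return {name: skipped.get(name, []) for name in missing}


SAMPLE = """\
/usr/local/bin/ld: skipping incompatible //usr/lib/libpthread.so when searching for -lpthread
/usr/local/bin/ld: skipping incompatible //usr/lib/libpthread.a when searching for -lpthread
/usr/local/bin/ld: cannot find -lpthread
/usr/local/bin/ld: skipping incompatible /lib/libc.so.7 when searching for /lib/libc.so.7
/usr/local/bin/ld: cannot find /lib/libc.so.7
"""

# Example use:
# for lib, rejected in summarize_ld_log(SAMPLE).items():
#     print(lib, "rejected:", rejected)
```

If every unresolved library has at least one rejected candidate, the files are present but built for a different architecture than the one being linked.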

WalterBright commented 6 years ago

@jmdavis yes, the autotester is passing on FreeBSD 64, no reason to not keep that platform.

braddr commented 6 years ago

https://auto-tester.puremagic.com/platform-history.ghtml?projectid=15&os=FreeBSD_64_64

It has passed the most recent iteration, but fails more often than not. And it doesn't look to be 10.x or 11.x specific. Am I missing something?

WalterBright commented 6 years ago

The random dmd failures are:

/home/ec2-user/sandbox/at-client/master-139146-FreeBSD_64_64/dmd/generated/freebsd/release/64/dmd -conf= -m64 -Irunnable  -fPIC -L-lstdc++  -odgenerated/runnable -ofgenerated/runnable/externmangle_0  runnable/externmangle.d generated/runnable/externmangle.cpp.o 
generated/runnable/externmangle.cpp.o: In function `Test10Dtor(Test10*&)':
externmangle.cpp:(.text+0xd8): undefined reference to `operator delete(void*, unsigned long)'
generated/runnable/externmangle.cpp.o: In function `Expression::dispose(Expression*&)':
externmangle.cpp:(.text+0x332): undefined reference to `operator delete(void*, unsigned long)'
generated/runnable/externmangle.cpp.o: In function `Test38::dispose(Test38*&)':
externmangle.cpp:(.text+0x436): undefined reference to `operator delete(void*, unsigned long)'
cc: error: linker command failed with exit code 1 (use -v to see invocation)

Why that would occur randomly makes no sense to me.

WalterBright commented 6 years ago

The phobos tests fail with:

gmake[1]: *** [posix.mak:348: generated/freebsd/debug/64/unittest/std/traits.o] Killed

a fine, unhelpful message.

WalterBright commented 6 years ago

This looks relevant: https://issues.dlang.org/show_bug.cgi?id=17596

jmdavis commented 6 years ago

> This looks relevant: https://issues.dlang.org/show_bug.cgi?id=17596

That will matter when 12.0 comes out but doesn't affect 11.1. It has to do with an API change to the system calls having to do with inodes, since they changed them from 32-bit to 64-bit, and those changes weren't done in time for 11. It is on my todo list to try and fix that though, since at the moment, I can't update my TrueOS machines because of it.

braddr commented 6 years ago

The delete related linker problem in the dmd test seems to be a freebsd 10.3 problem. It always fails on 10.3 and never fails for 11.1 (for a sample size of about 25 builds).

WalterBright commented 6 years ago

Should we update the test box to the latest FreeBSD? I'm good with that.

braddr commented 6 years ago

Either I'm misunderstanding your question or you're misunderstanding the state of things.

The two build machines in the "DMD upd fbsd" project are running freebsd 10.3 and 11.1. The several machines that were in the other build fleets were 8.3 and 8.4. The latter became non-functional today. The former aren't able to reliably build and run the tests.

What change are you suggesting?

I can't advise that the 10.3 and 11.1 build hosts be added to the main build fleet until the "DMD upd fbsd" boxes reliably pass the build and test steps. Otherwise all pulls are going to fail and require either manual pulling due to not being able to pass the full set of platforms or fixing the freebsd problems first. The former is really not reasonable. The latter isn't better than fixing them before adding new builders to the master and stable builds.

Do you feel differently?

WalterBright commented 6 years ago

> Do you feel differently?

I was responding to "a freebsd 10.3 problem. It always fails on 10.3 " and "That will matter when 12.0 comes out but doesn't affect 11.1. It has to do with an API change to the system calls having to do with inodes, since they change them from 32-bit to 64-bit, and those changes weren't done in time for 11."

To me that suggests: "frak the older FreeBSD versions, and install the latest on the autotester machines."

braddr commented 6 years ago

I'm fine with ignoring 10.3, though it seems premature to write off an entire in-production version with as little investigation as has been done so far. For all we know at this point 10.x might be easier to fix support for than 11.x. 12.x isn't released yet, so I don't suggest that being our base platform. I also haven't ever tested it, so its status with respect to dmd is unknown. That leaves 11.1, which also isn't passing yet.

WalterBright commented 6 years ago

Do you have a suggestion for "It always fails on 10.3" with some sort of linker problem?

braddr commented 6 years ago

Everything I know came from a quick glance at the logs to correlate failures between platforms. If you want my suggestion, drop freebsd. Not worth the time investment relative to the number of contributors.

Regardless, I'm ready to close this ticket now that the defunct freebsd 8.x platform is removed from the tester fleet. The issues are not related to the auto-tester, and keeping them in its issue tracker misplaces the nature of the problem and the responsibility for owning the resolution. Should 10.x and/or 11.x become functional enough to add into the official build fleet, creating an issue for that would be appropriate.

jmdavis commented 6 years ago

If we're getting random failures on 64-bit 11.1, then we can't enable that yet. If we're getting random failures with 64-bit 10.3, then that doesn't give me warm, fuzzy feelings, but Walter and Andrei have previously stated that we can only afford to support the latest FreeBSD, and that would be 11.1. And if we can have an auto-tester running that, then we'll catch problems that won't be caught otherwise, even if that unfortunately means that 10.3 might have problems. So, if 64-bit 11.1 is running well enough to update the auto-tester for that, then I say go for it, but I don't know how well it's running. The upd fbsd section on the auto-tester just gives the architectures, not the versions, so I don't know which of the boxes are running 10.3 and which are running 11.1.

If you want to close this issue on the grounds that it's for fixing the 8.x boxes, and that's not possible, then I don't care. But I really don't want to see FreeBSD dropped from the auto-tester permanently.

I'd suggest that you make both of the boxes on the upd fbsd section run FreeBSD 11.1 and that we just forget about 10.3. Then we can see how stable 11.1 is and figure out when it's stable enough to re-enable as part of the auto-tester proper.

dkgroot commented 6 years ago

It's not 100% clear which os-patch level you are running with freebsd 11.1 / 10.3. Check out: https://www.cyberciti.biz/open-source/update-your-openssl-on-freebsd-10-x-11-x-to-fix-vulnerabilities/

It's a pretty simple procedure to regularly run

freebsd-update fetch
freebsd-update install

When this causes the patchlevel to go up, a reboot is required.

freebsd-version -k    # shows kernel patchlevel
freebsd-version -u    # shows userland patchlevel
uname -mrs            # shows kernel patchlevel

After a reboot do run:

pkg update && pkg upgrade

That should solve the openssl issue you are seeing, and fix future issues as well.

More about freebsd updating/upgrading and patchlevels

jmdavis commented 6 years ago

@dkgroot Brad knows full-well how to update FreeBSD. That's not the problem.

The auto-tester has been running FreeBSD 8.x for ages now, and the oldest version of FreeBSD that's supported by the FreeBSD folks is 10.3. Brad has a couple of boxes running 10.3 or 11.1 (it's not clear which box is running what, because they're just marked as 32-bit and 64-bit in the auto-tester interface) listed under the upd fbsd tab: https://auto-tester.puremagic.com/?projectid=15

The problem is that while 8.x has been tested by the auto-tester for ages, so it's reasonably stable, newer versions have generally only been tested by folks locally, so problems have crept in and haven't all been fixed. So, Brad's extra testers running with 10.3 and 11.1 show problems that need to be fixed before the main auto-tester boxes can be updated. But we can't continue to use 8.x, because it's not receiving updates and does not support the version of SSL/TLS that github now insists on.

However, because the main auto-testers have not been running a recent version of FreeBSD, and there hasn't been enough of an effort to make sure that the problems found by the boxes under the upd freebsd have been fixed, Brad can't update the main auto-tester boxes, because the dmd tests will fail.

So, it looks like we're forced to disable FreeBSD on the main auto-tester boxes until the problems found by the more up-to-date auto-tester boxes are fixed.

dkgroot commented 6 years ago

@jmdavis I cannot know what other people know or do not know :-) Regarding the SSL issue (subject), it was not clear which patchlevel was running on these test-slaves. Adding the extra background information about patchlevels could help others (non freebsd users) understand what this was all about.

Deprecating old/eol versions ought to be part of the process. Even 10.3 and 10.4 have been declared legacy :-) No need to keep 8.x around.

If there is anything I could do to help, just let me know.

braddr commented 6 years ago

At the top of each build log is a dump of the actual versions of the kernel and a number of key tool chain components. For example:

https://auto-tester.puremagic.com/show-run.ghtml?projectid=15&runid=139501&dataid=967382

==== Toolchain Information ====
uname -a: FreeBSD ip-172-31-5-14 10.3-RELEASE-p24 FreeBSD 10.3-RELEASE-p24 #0: Wed Nov 15 04:57:40 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
MAKE(gmake): GNU Make 4.2.1
Built for amd64-portbld-freebsd10.3
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Illegal option --
SHELL(/bin/sh):
HOST_DMD(/home/ec2-user/sandbox/at-client/release-build/install/freebsd/bin64/dmd): DMD64 D Compiler v2.068.2
Copyright (c) 1999-2015 by Digital Mars written by Walter Bright
HOST_CXX(c++): FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
Target: x86_64-unknown-freebsd10.3
Thread model: posix
ld: GNU ld (GNU Binutils) 2.28
gdb: /usr/bin/gdb
==== Toolchain Information ====

So, at the moment, ip-172-31-5-14 (35.163.230.63) is a 10.3 host and ip-172-31-15-226 (52.25.51.127) is an 11.1 host. But as they're ec2 spot instances, they are subject to termination and re-creation pretty much at-will. So, check the logs.
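Since the hosts can change, the FreeBSD version for a given run has to come from the `uname -a` line in that run's log. A rough sketch of pulling the host name, release, and patch level out of such a line is below; the function name and the regular expression are assumptions written against the one sample in this thread, and they only handle `-RELEASE` builds.

```python
import re

# Matches e.g. "FreeBSD ip-172-31-5-14 10.3-RELEASE-p24"; the patch suffix is optional.
UNAME_RELEASE = re.compile(r"FreeBSD (\S+) (\d+)\.(\d+)-RELEASE(?:-p(\d+))?")


def parse_freebsd_release(uname_line):
    """Return (host, major, minor, patchlevel), or None if the line does not match."""
    match = UNAME_RELEASE.search(uname_line)
    if match is None:
        return None
    host, major, minor, patch = match.groups()
    return host, int(major), int(minor), int(patch or 0)


# Example use with the line from the log above:
# parse_freebsd_release("uname -a: FreeBSD ip-172-31-5-14 10.3-RELEASE-p24 FreeBSD 10.3-RELEASE-p24 #0: ...")
```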

braddr commented 6 years ago

The reason I want to close this issue and push the work back into the dlang repos is that there's nothing wrong with the auto-tester itself here (well, beyond 8.x not being able to talk to github, but that's being accepted as a consequence of age). FreeBSD 10.x and 11.x are executing the test infrastructure just fine. It's the system under test that's not passing, and that's a dlang issue, not an auto-tester issue.

At some future date, re-arranging the configuration of the testing of the DMD master and DMD stable configurations will need to be done, and that would be an appropriate ticket for the auto-tester. Working through the fixes required within dmd/druntime/phobos, does not belong here.

That make sense?

braddr commented 6 years ago

Lastly, in terms of supported versions, there's a dual responsibility model here. The auto-tester has to have the appropriate configurations available, but what versions to test and maintain is a dlang decision. If you guys truly want me to add freebsd 11.1 to the dmd master or dmd stable (or any other build config) then I will, but the fallout is on those that made the decision, not me for having performed the config change.

jmdavis commented 6 years ago

Okay. Looking at the 64-bit tester, the weird linker error is always on 10.3, whereas 11.1 either passes or gets killed when running the Phobos tests. It never actually fails the tests. Under what circumstances does the auto-tester kill an auto-tester run? When something gets merged? I think that we need to know why it would be being killed before we enable 64-bit 11.1 for the main auto-tester. dmd would clearly have to be fixed for 32-bit 11.1 before 32-bit is enabled regardless.

So, while it's up to Walter, I say that we make it so that upd freebsd only runs FreeBSD 11.1 and forget about 10.3, since it's not the latest. Whether we update the main 64-bit FreeBSD auto-tester to 11.1 and enable it depends on whether the killed auto-tester runs indicate a problem, or whether the auto-tester simply decided that there's no point in continuing due to a merged PR or some other similarly benign reason. Either way, with the upd freebsd boxes consistently running 11.1, there won't be any confusion over which version of FreeBSD we're dealing with, and any errors will probably be less random.

And @braddr, given that you would like to close this issue and have any discussion on updating the auto-tester elsewhere, where would you like such a discussion to be? In e-mails? In the newsgroup? In another issue specifically for updating the auto-tester rather than one for an SSL problem? Somewhere else that I haven't thought of?

braddr commented 6 years ago

The auto-tester only has a high level timeout (defaults to an hour if I remember right) which will kill the run. It will abort tests between steps of the process if the run has been deleted / marked obsolete, but that never is considered a failed build, that wouldn't make any sense.

The only other reasons would be the os killing processes for things like oom (both testers have 4 gigs of memory, so not likely) or a segfault or whatever, none of which are the auto-tester or its configuration.
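The difference between a run that hit the tester's timeout and a step that the OS killed can be read from how the child process ended. Here is a minimal sketch, assuming a POSIX host where a negative return code from Python's `subprocess` means the child was terminated by that signal; `run_step` is a hypothetical name, not the auto-tester's own code.

```python
import signal
import subprocess
import sys


def run_step(argv, timeout_seconds):
    """Run one build/test step and classify how it ended."""
    try:
        completed = subprocess.run(argv, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # The harness's own limit fired; subprocess.run has already killed the child.
        return "timeout"
    if completed.returncode < 0:
        # POSIX: a negative code means the child died from a signal,
        # e.g. SIGKILL from the kernel's out-of-memory handling.
        return "killed by " + signal.Signals(-completed.returncode).name
    if completed.returncode == 0:
        return "ok"
    return f"failed ({completed.returncode})"


# Example use:
# print(run_step([sys.executable, "-c", "pass"], timeout_seconds=3600))
```

A "Killed" line from gmake, as in the Phobos log earlier in the thread, corresponds to the signal case rather than to a test failure.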

I'm not suggesting that issues about the auto-tester be discussed elsewhere. I'm suggesting that since there aren't issues with the auto-tester, but rather with dmd/druntime/phobos on freebsd, that those issues belong in the appropriate dlang issue tracker. I consider this the same sort of issue as dmd taking more memory after some change and causing oom failures. The problem isn't the test infrastructure, it's dmd or the code being compiled. You wouldn't track those issues as auto-tester issues.

WalterBright commented 6 years ago

> If you guys truly want me to add freebsd 11.1 to the dmd master or dmd stable (or any other build config) then I will,

Yes, please.

> but the fallout is on those that made the decision, not me for having performed the config change.

Feel free to blame me.

braddr commented 6 years ago

Apologies, but one last confirmation, you're aware that it doesn't reliably pass right now, right? So pulls are going to fail left and right and that becomes everyone's problem.

JinShil commented 6 years ago

Please don't break the CI. We need everything passing to do our work.

jmdavis commented 6 years ago

32-bit doesn't pass at all, so we clearly can't enable that. It's not clear to me how often 64-bit will fail, because the failure is random, and it's due to it being killed for who-knows-why, not due to any tests actually failing, and the upd freebsd boxes have been running a mixture of 10.3 and 11.1, and 10.3 fails due to completely different reasons.

I definitely think that the upd freebsd boxes should be running 11.1 and only 11.1 so that the situation with 11.1 is clearer, but pulling the trigger on 64-bit 11.1 on the main auto-tester might be premature, much as we clearly need to get to the point that we can sooner rather than later. But either way, we definitely shouldn't be enabling 32-bit yet.

WalterBright commented 6 years ago

We can't find the problems unless it is failing. So yes, confirm. As for everyone's problem, this has been a problem that's been interminably ignored for a long time, because it could be ignored. Time to make it a problem that can't be ignored.

WalterBright commented 6 years ago

Also, I don't care about 10.3.

braddr commented 6 years ago

Done. Two 11.1 build hosts, no 10.3. Enabled for "dmd d2 master" and "dmd d2 stable" for both 32 and 64 bit.

"dmd upd fbsd" is the same two build hosts now as well.

braddr commented 6 years ago

For what it's worth, the 32 bit build failure looks like it might be a missing -m$(MODEL) when building gen_man. Why that would only show up on the fbsd 32 bit build is unclear. I've only looked at the build logs so it might be obvious from the makefile.

jmdavis commented 6 years ago

From reading the log, it looks like the 32-bit box may be a 64-bit box building for 32-bit. Maybe the problem relates to that?

braddr commented 6 years ago

Yes, freebsd 10+ now supports a mixed 32/64 bit environment, like linux has done forever. And is being used that way on these hosts.

WalterBright commented 6 years ago

I've been looking at dirent.d https://github.com/dlang/druntime/blob/master/src/core/sys/posix/dirent.d#L152

And comparing it with freebsd dirent.h https://github.com/freebsd/freebsd/blob/master/sys/sys/dirent.h#L66

which it does not match at all, however, it matches https://github.com/freebsd/freebsd/blob/master/sys/sys/dirent.h#L83

Anyone know what is going on?

jmdavis commented 6 years ago

It may relate to the inode size changes for 12.0, since they had to play some games to avoid breaking backwards compatibility (some games that don't automatically work for us, since we don't use C headers). I'd strongly suggest that you look at the stable branch of FreeBSD and not CURRENT / master. I think that this is the correct branch:

https://github.com/freebsd/freebsd/tree/stable/11

jmdavis commented 6 years ago

And that branch has

https://github.com/freebsd/freebsd/blob/stable/11/sys/sys/dirent.h#L50

WalterBright commented 6 years ago

This would explain why 10.3 is failing - the declarations in druntime match the 11 declarations.

braddr commented 6 years ago

What? Druntime matches 8 and 11, so presumably it also matched everything in between. It's only 12 that's changed that struct's layout. How's that explain anything about the 10.3 failure?

WalterBright commented 6 years ago

See https://github.com/freebsd/freebsd/blob/master/sys/sys/dirent.h#L83 where it says:

#if defined(_WANT_FREEBSD11_DIRENT) || defined(_KERNEL)
struct freebsd11_dirent {

implying it is different from before 11, and 11 is the one we match.

jmdavis commented 6 years ago

I think that it's freebsd11_dirent, because that's what FreeBSD 11 uses, whereas FreeBSD CURRENT (what will be 12) uses a different struct, because they changed inodes to be 64-bit instead of 32-bit, and they couldn't simply change the struct without breaking code.

FreeBSD 8 pretty much has to have the same version of the struct that we have in druntime, because that's what the auto-tester has been using for ages. And given all of the effort the FreeBSD devs went to to avoid breaking backwards compatibility with the inode changes, I don't think that these structs have changed much in a very long time.
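To make the layout difference concrete, the two structs can be modelled with `ctypes` and their field offsets compared. This is only a sketch: the field lists below are written from memory of the two `dirent.h` versions linked in this thread and are an assumption that should be checked against the headers. The class names `Dirent11` and `Dirent12` are hypothetical, and neither is druntime's actual declaration.

```python
import ctypes


class Dirent11(ctypes.Structure):
    """Assumed FreeBSD 11 layout (freebsd11_dirent): 32-bit file number."""
    _fields_ = [
        ("d_fileno", ctypes.c_uint32),
        ("d_reclen", ctypes.c_uint16),
        ("d_type", ctypes.c_uint8),
        ("d_namlen", ctypes.c_uint8),
        ("d_name", ctypes.c_char * 256),
    ]


class Dirent12(ctypes.Structure):
    """Assumed FreeBSD CURRENT/12 layout: 64-bit file number plus an offset field."""
    _fields_ = [
        ("d_fileno", ctypes.c_uint64),
        ("d_off", ctypes.c_int64),
        ("d_reclen", ctypes.c_uint16),
        ("d_type", ctypes.c_uint8),
        ("d_pad0", ctypes.c_uint8),
        ("d_namlen", ctypes.c_uint16),
        ("d_pad1", ctypes.c_uint16),
        ("d_name", ctypes.c_char * 256),
    ]


def name_offsets():
    """Byte offset of d_name in each assumed layout."""
    return Dirent11.d_name.offset, Dirent12.d_name.offset


# Example use:
# print(name_offsets(), ctypes.sizeof(Dirent11), ctypes.sizeof(Dirent12))
```

Under these assumed layouts, code compiled against the older declaration would read `d_name` at the wrong offset when handed the newer struct, which is why a binding that does not go through the C headers has to track the change by hand.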