ocaballeror closed this 6 years ago
Also, it turns out many tests were failing due to timeouts because we were trying to run 8 parallel threads when the AppVeyor machine has only 2 cores. Using pytest -n2 works much better :+1:
@ocaballeror Yeah, we could even avoid using pytest-xdist in AppVeyor. Historically it has given us some problems in Travis, but at least it sped things up. If we are not going to notice much of a speed-up, then we might as well avoid it.
I'd say we could even, for AppVeyor, run tests in a single thread (i.e. no pytest-xdist) and only against the latest Python version.
@Peque I've run it a few times with 2 threads and it seems to work fine. We could leave it with 2 and revert to single-threaded mode if we run into any problems in the future. After all, if we have extra cores, why not use them?
About the Python versions, though, why test only the latest one? There could be implementation differences from version to version that suddenly break compatibility, so I'd leave it as it is right now, just to be safe.
Update: Still working on the linger tests. There must be something that Windows does differently in its internal implementation of sockets and the SO_LINGER option, but I'm having a hard time figuring out what it is.
I'll keep posting updates as I find out more.
Took me a long while to figure it out. In the end, the problem with the linger tests had nothing to do with sockets, zmq or anything related to them. It was actually caused by the way we changed the linger value in every test. We were updating the global config dictionary and then expecting the .close_all() method to read the updated value, but that won't work on Windows because of the way it creates new processes. On Unix the fork() system call is used to create the agent processes, but since fork() is not available on Windows and new processes are created using spawn(), we can't expect the global config variable to be copied to child processes.
In the end, the call to get_linger() inside close_all() was returning the default linger value of 1000 instead of the updated one. I changed this so close_all() accepts a linger parameter.
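For illustration, here is a minimal, self-contained sketch (not osbrain's actual code; the config contents are made up) of the behaviour described above: under spawn, the child re-imports the module and gets the default value, whereas a forked child would inherit the parent's update.

```python
import multiprocessing as mp

# Stand-in for osbrain's global configuration dictionary.
config = {'LINGER': 1000}


def report():
    # With the 'spawn' start method, the child re-imports this module,
    # so it sees the default value, not the parent's update.
    print('child sees LINGER =', config['LINGER'])


if __name__ == '__main__':
    config['LINGER'] = 0  # updated in the parent only
    ctx = mp.get_context('spawn')  # the only start method on Windows
    p = ctx.Process(target=report)
    p.start()
    p.join()  # prints: child sees LINGER = 1000
```

Passing the value explicitly, as close_all(linger=...) now does, avoids relying on the child process seeing the parent's globals at all.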
By the way, there are 2 other tests currently failing, and they are both related to Windows allowing more than one socket to bind to the same address.
I will try to implement some kind of check to prevent this, manually raising an error if the user tries to do it. @Peque, or should I just skip these tests on Windows and kindly advise people against this sort of practice?
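In case a pre-bind check is the way to go, here is a rough sketch of what it could look like, assuming plain TCP addresses; `assert_address_free` is a hypothetical helper, and the SO_EXCLUSIVEADDRUSE probe is Windows-specific:

```python
import socket
import sys


def assert_address_free(host, port):
    """Raise if something is already bound to (host, port)."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if sys.platform == 'win32':
            # Make the probe bind fail if *anyone* already holds the
            # address, even sockets bound with SO_REUSEADDR.
            probe.setsockopt(
                socket.SOL_SOCKET, socket.SO_EXCLUSIVEADDRUSE, 1)
        probe.bind((host, port))
    except OSError:
        raise RuntimeError('Address %s:%s is already in use' % (host, port))
    finally:
        probe.close()
```

Note that there is an inherent race between closing the probe and performing the real bind, so this could only ever be a best-effort check.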
Merging #252 into master will decrease coverage by 0.05%. The diff coverage is 96.22%.
@@ Coverage Diff @@
## master #252 +/- ##
==========================================
- Coverage 99.08% 99.03% -0.06%
==========================================
Files 26 26
Lines 3503 3528 +25
Branches 250 251 +1
==========================================
+ Hits 3471 3494 +23
- Misses 20 21 +1
- Partials 12 13 +1
Impacted Files | Coverage Δ | |
---|---|---|
osbrain/tests/test_proxy.py | 100% <100%> (ø) | :arrow_up: |
osbrain/tests/test_agent_ipc_sockets.py | 100% <100%> (ø) | :arrow_up: |
osbrain/__init__.py | 100% <100%> (ø) | :arrow_up: |
osbrain/nameserver.py | 99.21% <100%> (+0.01%) | :arrow_up: |
osbrain/agent.py | 97.7% <100%> (ø) | :arrow_up: |
osbrain/tests/common.py | 92.85% <100%> (+2.38%) | :arrow_up: |
osbrain/tests/test_agent.py | 100% <100%> (ø) | :arrow_up: |
osbrain/tests/test_nameserver.py | 98.11% <100%> (+0.01%) | :arrow_up: |
osbrain/tests/test_agent_transport.py | 93.97% <88.88%> (-2.13%) | :arrow_down: |
Continue to review the full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a0d96cc...f6482ec.
@ocaballeror I would say, for now, we just skip the failing tests on Windows. Then we can create specific issues for those failures and decide whether we should fix them or just accept that they will fail on Windows.
Also, that way we keep this PR a bit simpler and merge it faster (even if it is not perfect, it is definitely a step forward that we want to include in our code base). :blush:
@Peque Maybe I should create another flag? Something like skip_for_now, so we know that a specific test still needs reviewing?
@ocaballeror I fear the skip_for_now name... :joy: (you know, sometimes "temporary" fixes end up being permanent...)
I like the idea of using more specific flags, though. Maybe skip_windows_ipc and a generic skip_windows_broken for (currently) unknown problems.
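Something along these lines, assuming the markers are built on pytest's standard skipif mechanism (names as suggested above):

```python
import sys

import pytest

on_windows = sys.platform == 'win32'

skip_windows_ipc = pytest.mark.skipif(
    on_windows,
    reason='IPC (UNIX domain) sockets are not available on Windows')
skip_windows_broken = pytest.mark.skipif(
    on_windows,
    reason='Known to fail on Windows; cause not yet diagnosed')


@skip_windows_broken
def test_example():
    ...
```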
Added a bunch more @skip_windows indicators, to be able to differentiate between all the roadblocks we're running into.
Finally 😄
This took way longer than I expected
Continuing the discussion from #147.
Also, just to be clear, I don't intend to merge this yet, so there may be some weird comments and print statements around the code.