Open · PhilipDeegan opened 1 year ago
Sorry, just discovering this thread :o, wow, the amount of work is huge!
Honestly, these tests involving randomness are very badly designed, but I didn't know that back then :).
About changing the AUC threshold from .7 to .69: this is not an issue at all. The only thing we are checking is that the model learns something significantly better than random guessing (which would give an AUC of 0.5), so 0.65 or even 0.6 would also be fine.
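For illustration, here is a minimal sketch of that kind of check. The data and scores are synthetic stand-ins, not the actual test fixtures; only the shape of the assertion mirrors the real test:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the real test data: scores are shifted up for
# positive labels, so this fake "model" is clearly better than random.
rng = np.random.RandomState(42)
y_true = rng.randint(0, 2, size=500)
y_score = 0.5 * y_true + rng.rand(500)

auc = roc_auc_score(y_true, y_score)
# A random classifier gives AUC ~ 0.5, so any threshold comfortably
# above that (0.6, 0.65, 0.69, ...) checks that learning happened
# without being brittle to small cross-platform fluctuations.
assert auc > 0.69
```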
About the Darwin test filtering, this is due to:
```python
# This is temporary
# At the moment, seeds are not cross platform as distributions in
# C++ depend on the standard library shipped with the compiler
import os

import numpy as np  # needed for the assertion below

if os.name == 'posix':
    import platform
    if platform.system() == 'Darwin':
        # We check that we get the same as what was recorded
        # (seeded_sample_1 / seeded_sample come from the surrounding test)
        np.testing.assert_almost_equal(seeded_sample_1, seeded_sample)
```
(note the "temporary"...). That test is also badly designed. Its purpose was to check that with the same seed we obtain the same random numbers. But nothing really guarantees this, and I guess an update to Darwin's random number generation could easily break it.
I would thus simply remove this part of the test, as well as all the seeded_sample arguments. If you need help, tell me; otherwise I'm quite happy not to have to set up tick, CMake, and SWIG on my machine :)
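If some seeding check is still wanted, a portable variant would assert determinism within the current platform rather than against values recorded on another one. A minimal sketch, where `draw_sample` is a hypothetical stand-in for the actual tick simulator:

```python
import numpy as np

def draw_sample(seed):
    # Hypothetical stand-in for the C++-backed tick simulation; the
    # pattern is what matters: same seed, same process -> same numbers.
    rng = np.random.RandomState(seed)
    return rng.standard_normal(10)

# Run the simulation twice with the same seed and compare the runs to
# each other, instead of against numbers recorded on another platform.
np.testing.assert_array_almost_equal(draw_sample(seed=12345),
                                     draw_sample(seed=12345))
```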
The biggest change really is that this now fixes some things from https://github.com/X-DataInitiative/tick/issues/504.
OSX still has some weird issues with random numbers, and I'm guessing this is the reason why some real tests are failing now. For instance:
I changed it from `.7` to `.69`, but I'm not sure it really should be changed; maybe it's OK as it is? @Mbompr thoughts?

With this pull request, we should be able to publish new wheels to the test-pypi repo on every master merge, and push to regular PyPI on each release/tag.
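For what it's worth, a hedged sketch of that publishing logic in Python. The `CI_TAG`/`CI_BRANCH` environment variables are hypothetical placeholders for whatever the CI system actually exposes (e.g. `TRAVIS_TAG`, `GITHUB_REF`); this is not the PR's actual CI config:

```python
import glob
import os
import subprocess

# Decide where to publish based on hypothetical CI environment variables.
if os.environ.get("CI_TAG"):                   # release/tag -> regular PyPI
    repo_url = "https://upload.pypi.org/legacy/"
elif os.environ.get("CI_BRANCH") == "master":  # master merge -> test PyPI
    repo_url = "https://test.pypi.org/legacy/"
else:
    repo_url = None                            # PR builds: publish nothing

if repo_url:
    # Upload all previously built wheels/sdists in dist/ with twine.
    subprocess.check_call(
        ["twine", "upload", "--repository-url", repo_url]
        + glob.glob("dist/*")
    )
```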