Closed. EgorBu closed this issue 5 years ago.
I cannot reproduce:
Though I can see that there is a bug in the rule descriptions; I will fix it in the same PR as the one addressing #221.
@EgorBu, do you still have this problem?
Yes, I still have the problem. I used git bisect to find the commit where things go wrong:
057d7973894e6147236d4633a1554f4bb02415be is the first bad commit
commit 057d7973894e6147236d4633a1554f4bb02415be
Author: Hugo Mougard <hugo@sourced.tech>
Date: Wed Oct 3 01:43:51 2018 +0200
Revamp feature names handling and related feature extraction mechanisms
Signed-off-by: Hugo Mougard <hugo@sourced.tech>
:040000 040000 31d82cad4f4fae295c94c41f63a52f10444659d0 3b0c1b3c9888557b694e7a90afe0fffbba9098a6 M lookout
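For context, a git bisect session that produces output like the above can be sketched on a toy repository (everything below is hypothetical: the repo, the commits, and the "bug" are made up for illustration, and "bad" is simulated by a file check instead of re-training a model):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
# Five commits; pretend commit 4 introduced the regression.
for i in 1 2 3 4 5; do
    echo "$i" > quality.txt
    git add quality.txt
    git -c user.email=ci@example.com -c user.name=ci commit -qm "commit $i"
done
good=$(git rev-list --max-count=1 HEAD~4)   # commit 1 is known good
git bisect start HEAD "$good"
# git bisect run treats exit 0 as "good", non-zero as "bad";
# here "bad" stands in for "re-trained model gives poor quality".
git bisect run sh -c 'test "$(cat quality.txt)" -lt 4' > /dev/null
first_bad=$(git bisect log | sed -n 's/^# first bad commit: \[\([0-9a-f]*\)\].*/\1/p')
git log -1 --format=%s "$first_bad"
git bisect reset > /dev/null
```

In the real investigation each bisect step means re-training and re-evaluating the model before marking the commit good or bad.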
I don't know what exactly is wrong; I will continue the investigation. An additional detail: launching the same model on the same dataset twice gives different results:
And it's super strange.
Can you also give your training code?
Analyzer:
analyzer run lookout.style.format -c config.yml --log-level DEBUG
with config:
server: 0.0.0.0:2000
db: sqlite:////tmp/lookout.sqlite
fs: /tmp
Training query
lookout-sdk_linux_amd64/lookout-sdk push ipv4://localhost:2000 --git-dir /home/egor/workspace/tmp/freeCodeCamp/ --from HEAD^ --to HEAD
I'm using the latest lookout sdk
Ok, just to be sure I also trained this way (even though it shouldn't differ that much from the research script), but I still cannot reproduce :(
Yes, it's super strange (especially the part where two eval queries give different results; I don't understand how it could be randomized).
just to make sure: how do you pick up the model after training?
Quality report query:
python3 -m lookout.style.format eval -i "/home/egor/workspace/tmp/freeCodeCamp_no_min.js/**/*" -m /tmp/home/egor/workspace/tmp/freeCodeCamp/style.format.analyzer.FormatAnalyzer_1.asdf -n 10
and during the bisect search I removed the folder with the model (rm -rf /tmp/home) after each experiment
@EgorBu @m09 do not forget to sync your package versions with requirements.txt.
cat requirements.txt | cut -f1 -d= | grep -v '#' | xargs pip3 show | grep -A1 Name:
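A rough Python equivalent of that shell pipeline (a sketch only; it queries installed distributions via setuptools' pkg_resources, which was the standard approach at the time, and the package names passed in are examples):

```python
import pkg_resources


def installed_versions(names):
    # Map each package name to its installed version,
    # or None when the package is not installed.
    versions = {}
    for name in names:
        try:
            versions[name] = pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            versions[name] = None
    return versions


print(installed_versions(["setuptools", "scikit-learn"]))
```

Diffing the output of this function between two machines quickly shows whether an environment mismatch explains a reproducibility gap.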
Name: sourced-ml
Version: 0.6.0
--
Name: xxhash
Version: 1.2.0
--
Name: stringcase
Version: 1.2.0
--
Name: SQLAlchemy
Version: 1.2.10
--
Name: SQLAlchemy-Utils
Version: 0.33.3
--
Name: Pympler
Version: 0.5
--
Name: cachetools
Version: 2.0.1
--
Name: ConfigArgParse
Version: 0.13.0
--
Name: humanfriendly
Version: 4.16.1
--
Name: psycopg2-binary
Version: 2.7.5
--
Name: scikit-learn
Version: 0.19.2
--
Name: tqdm
Version: 4.11.2
--
Name: scikit-optimize
Version: 0.5.2
--
Name: pandas
Version: 0.21.0
I have the same except for pandas, but that should only be used by typos. Though I do have the latest packages we added to the setup in there (gensim, google-compute-engine). Egor, do you still get the bad quality if you re-run
pip install -e .
?
Aaaaaaaah, I can reproduce with python 3.5, nice. I'll look into it now :)
yes, I'm using
egor@egor-sourced:~/workspace/style-analyzer$ python3 --version
Python 3.5.2
which version did you use before?
I usually use Python 3.6.4
Interesting, what could cause the difference in behaviour between versions?
I remember that @smacker told us that Apollo worked differently on different Python versions.
I'm so eager to find out what the problem is. Also curious why the tests did not catch it: given the fixed random seeds, all the tests should work the same.
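A general note on why fixed random seeds alone would not surface this class of bug (this is a generic Python observation, not taken from the thread): seeding the PRNG does not control dict/set iteration order, which on Python 3.5 depends on string-hash randomization fixed per interpreter process via PYTHONHASHSEED:

```python
import random

# Seeding makes the PRNG stream reproducible within and across runs...
random.seed(42)
first = [random.random() for _ in range(3)]
random.seed(42)
assert first == [random.random() for _ in range(3)]

# ...but it has no effect on string hashing. On Python 3.5 the iteration
# order of {"a": 1, "b": 2} can therefore differ between two interpreter
# processes unless PYTHONHASHSEED is pinned in the environment. Within a
# single process the order stays consistent, which is why one test run
# looks perfectly deterministic.
```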
It should be fixed by #240. With the back and forth of PR #204 I introduced an extra data structure that should have been ordered and wasn't. Python 3.5 doesn't guarantee the order of dict keys during iteration, but 3.6 does, as @EgorBu found out.
@vmarkovtsev Regarding the tests not catching it: that should be because the iteration order is consistent within a single Python interpreter run.
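As an illustration of the class of bug described above (a minimal sketch, not the actual style-analyzer code): a plain dict used where insertion order matters, with collections.OrderedDict as the version-independent fix:

```python
from collections import OrderedDict


def feature_index_fragile(names):
    # On CPython 3.5, plain-dict key order is arbitrary and can change
    # between interpreter runs, so these column indices are not stable.
    # CPython 3.6 happens to preserve insertion order, which masks the bug.
    return {name: i for i, name in enumerate(names)}


def feature_index_stable(names):
    # OrderedDict guarantees insertion order on every Python version,
    # so the feature -> column mapping is deterministic everywhere.
    return OrderedDict((name, i) for i, name in enumerate(names))


names = ["indent", "quote", "newline"]
assert list(feature_index_stable(names)) == names
```

With the fragile version, a model trained in one process can map features to different columns than the process that later evaluates it, which matches the "same model, same data, different results" symptom in this thread.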
Oh. The mystery is solved! Thanks!
P.S. I'm not sure whether such a bug exists in Apollo right now. We use quite an old version.
Hi, I checked the quality with the current master using eval, and the quality drop is significant. And here are the stats about the used rules: