Closed Guillemdb closed 5 years ago
Well, as soon as it is almost the same files we have in src-d/ml can you be more specific how do you change it if you did. What I should pay more attention to?
The files are the same, the only stuff I did was:
The functionality has not been changes, I just changed the file locations.
I pushed a new commit with the requested changes and the .travis.yml file. I corrected the flake8 issues in another branch, if you allow me to ignore them in the next PR they will be fixed.
In the .travis.yml
file I have commented the deployment stage untill I get the correct environment variables.
Btw, @marnovo told me that there is no consensus on the year of the license, so I didn't make any change to it.
The build is failing because I remove the references to bblfsh on the MAKEFILE. Should I add it again so the build passes?
oh, sure. In this case, we should keep it.
When I run the tests on my local repository I find that only one of them is failing:
======================================================================
FAIL: test_result (ml_core.tests.test_uast_to_id_distance.Uast2IdLineDistanceTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/guillem/srcd/ml-core/ml_core/tests/test_uast_to_id_distance.py", line 141, in test_result
self.assertEqual(res, correct)
AssertionError: Lists differ: [(('__spec__', 'ModuleSpec'), 0), (('__spec__',[1679 chars], 1)] != [(('__package__', 'ModuleSpec'), 2), (('__spec_[2142 chars], 1)]
First differing element 0:
(('__spec__', 'ModuleSpec'), 0)
(('__package__', 'ModuleSpec'), 2)
Second list contains 14 additional elements.
First extra element 55:
(('utmain', 'collections'), 2)
+ [(('__package__', 'ModuleSpec'), 2),
- [(('__spec__', 'ModuleSpec'), 0),
? ^
+ (('__spec__', 'ModuleSpec'), 0),
? ^
(('__spec__', 'ModuleSpec'), 1),
(('__spec__', 'ModuleSpec'), 1),
+ (('__spec__', 'ModuleSpec'), 2),
(('__spec__', '__package__'), 0),
+ (('collections', 'ModuleSpec'), 1),
(('collections', 'ModuleSpec'), 2),
(('collections', '__package__'), 1),
(('collections', '__spec__'), 1),
+ (('collections', '__spec__'), 2),
(('modules', '__package__'), 1),
(('modules', '__spec__'), 1),
(('modules', 'collections'), 2),
(('namedtuple', 'ModuleSpec'), 0),
(('namedtuple', 'ModuleSpec'), 1),
+ (('namedtuple', 'ModuleSpec'), 1),
(('namedtuple', 'ModuleSpec'), 2),
(('namedtuple', 'ModuleSpec'), 2),
(('namedtuple', '__package__'), 1),
+ (('namedtuple', '__package__'), 2),
(('namedtuple', '__spec__'), 1),
(('namedtuple', '__spec__'), 1),
+ (('namedtuple', '__spec__'), 2),
+ (('namedtuple', '__spec__'), 2),
(('namedtuple', 'collections'), 0),
- (('namedtuple', 'collections'), 2),
? ^
+ (('namedtuple', 'collections'), 1),
? ^
(('namedtuple', 'modules'), 2),
(('setup_logging', 'modelforge.logs'), 0),
(('setup_logging', 'setup'), 1),
(('sys', '__package__'), 1),
(('sys', '__spec__'), 1),
(('sys', 'collections'), 2),
(('sys', 'modelforge.logs'), 2),
(('sys', 'modules'), 0),
(('sys', 'namedtuple'), 2),
(('sys', 'setup_logging'), 2),
(('utmain', 'ModuleSpec'), 0),
(('utmain', 'ModuleSpec'), 1),
(('utmain', 'ModuleSpec'), 1),
(('utmain', 'ModuleSpec'), 1),
(('utmain', 'ModuleSpec'), 2),
+ (('utmain', 'ModuleSpec'), 2),
+ (('utmain', 'ModuleSpec'), 2),
(('utmain', '__package__'), 0),
(('utmain', '__package__'), 0),
(('utmain', '__package__'), 1),
(('utmain', '__spec__'), 0),
(('utmain', '__spec__'), 0),
(('utmain', '__spec__'), 0),
(('utmain', '__spec__'), 1),
(('utmain', '__spec__'), 2),
(('utmain', 'collections'), 1),
(('utmain', 'collections'), 1),
(('utmain', 'collections'), 2),
+ (('utmain', 'collections'), 2),
(('utmain', 'modules'), 0),
(('utmain', 'modules'), 1),
(('utmain', 'modules'), 1),
(('utmain', 'namedtuple'), 1),
(('utmain', 'namedtuple'), 1),
(('utmain', 'namedtuple'), 1),
(('utmain', 'namedtuple'), 2),
+ (('utmain', 'namedtuple'), 2),
+ (('utmain', 'namedtuple'), 2),
+ (('utmain', 'namedtuple'), 2),
(('utmain', 'sys'), 0),
(('utmain', 'sys'), 1),
(('utmain', 'sys'), 1)]
Does the file test/source/example.py
depend on the project from where it is called?
Does this test somehow fail for something I did wrong?
However, during the CI, a lot of tests fail with a similar error to this one>
ERROR: test_greatest2 (ml_core.tests.test_df.DocumentFrequenciesTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/ml_core/ml_core/tests/test_df.py", line 10, in setUp
self.model = DocumentFrequencies().load(source=paths.DOCFREQ)
File "/usr/local/lib/python3.6/dist-packages/modelforge/model.py", line 93, in load
cache_dir = os.path.join(self.cache_dir(), self.NAME)
File "/usr/local/lib/python3.6/dist-packages/modelforge/model.py", line 338, in cache_dir
raise RuntimeError("modelforge is not configured; look at modelforge.configuration. "
RuntimeError: modelforge is not configured; look at modelforge.configuration. Depending on your objective you may or may not want to create a modelforgecfg.py file which sets VENDOR and the rest.
The modelforgecfg.py
is created, but somehow the path to the files is "/ml_core/ml_core/tests/test_df.py"
, and the folder ml_core
is duplicated. Do you have any idea of why this happens?
I am creating an additional copy of ml_core
and moving the whole project inside that folder, but I'm not sure where. Could it be in the L29 of the Dockerfile?
I will check tomorrow.
I add my changes to the commit. Now everything works fine. There are several things that @Guillemdb before I can approve it I ask you to do several more things:
core_ml
to sourced/ml/core
. Address all core_ml
entries in code accordingly.sourced.ml.mining
) keep __init__.py
empty from pkg_resources.declare_namespace(__name__)
we will add it later. If we add it now, it is not possible to test it. sourced.ml
entries in the PR and address them accordingly. flake8-commas==2.0.0
to requrements-lint.txt and address new style issues.things that should be addressed in the next PRs (@Guillemdb you can convert this list to issues after we merge this PR):
I finally managed to fix the CI and upload the following changes:
ml_core
to sourced.ml.core
.flake8-commas==2.0.0
because black already takes care of it. (I would like to add black as a pre-commit hook in future PRs.)sourced.ml.core.tests.models
to sourced.ml.core.tests.models_tensorflow
to avoit conflicts with the file sourced.ml.core.tests.models.py
.test_serialize
from tests.models_tensorflow.test_tensorflow
due to an unknown error with stack-trace:
======================================================================
ERROR: test_serialize (ml.core.tests.models_tensorflow.test_tensorflow.TensorFlowModelTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/sourced/ml/core/tests/models_tensorflow/test_tensorflow.py", line 18, in test_serialize
TensorFlowModel().construct(graphdef=gd).save(buffer)
File "/home/travis/virtualenv/python3.5.6/lib/python3.5/site-packages/modelforge/model.py", line 390, in save
raise ValueError("series must be specified")
ValueError: series must be specified
I didn't implement flake8-commas==2.0.0 because black already takes care of it. (I would like to add black as a pre-commit hook in future PRs.)
Ok. One thing, you should add not a pre-commit hook but another style check (see https://github.com/src-d/ml-core/pull/1#issuecomment-484558774). I am fine if you do it in the next PR. In the best case, black check should completely replace flake8.
@Guillemdb @vmarkovtsev It would be desirable to preserve history or, at least, separate commits for code import and code changes, so it's clear the point at which it's imported and what changes were made. Please, ping me on Slack if there's any doubt about how to do this.
So far there is no changes to the code functionality besides fixing the Tensorflow test, I was waiting for this PR to be merged to make any change to the core-ml functionality if needed.
@Guillemdb But there were non-functional changes, right? Or just things like imports?
Yeah, all the changed were non-functional, what I did was:
core-ml
project.lookout-sdk-ml
project (as @vmarkovtsev suggested).@Guillemdb It is impossible for me to review 118 files. I will comment and the thing will turn into an unmanageable mess. Can you please split all the changes into feature-full 5-6 PRs and submit them one by one. It is fine that each following PR depends on the previous ones.
This is the first stage of the refactor of
src-d/ml
intoml-core
.