apache / submarine

Submarine is Cloud Native Machine Learning Platform.
https://submarine.apache.org/
Apache License 2.0
691 stars 254 forks source link

SUBMARINE-1358. fix the potential timeout issue while building jupyter image #1029

Closed FatalLin closed 1 year ago

FatalLin commented 1 year ago

What is this PR for?

The github action hit the timeout exception while building the image of jupyter and jupyter-gpu time to time since the previous days. After couple testing, I found it indeed takes infinite time to install CVXPY with conda. Per the discussion on weekly meeting, I change the way to install CVXPY from conda to pip, and it also bring s some side-effect here like the some dependent package is too old and cause "use_2to3 is invalid" error, so some of the package had also been upgraded as well in this PR.

What type of PR is it?

Improvement

Todos

What is the Jira issue?

https://issues.apache.org/jira/browse/SUBMARINE-1358?

How should this be tested?

I build and deploy the image in my local environment and looks like everything is worked.

Screenshots (if appropriate)

Questions:

codecov[bot] commented 1 year ago

Codecov Report

Merging #1029 (8a76fc5) into master (fd692e2) will increase coverage by 50.48%. The diff coverage is n/a.

@@              Coverage Diff              @@
##             master    #1029       +/-   ##
=============================================
+ Coverage     25.49%   75.98%   +50.48%     
=============================================
  Files           358      119      -239     
  Lines         29089     5000    -24089     
  Branches       3479        0     -3479     
=============================================
- Hits           7415     3799     -3616     
+ Misses        21460     1201    -20259     
+ Partials        214        0      -214     
Flag Coverage Δ
python-integration 59.72% <ø> (+0.12%) :arrow_up:
python-unit 52.48% <ø> (+0.28%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...arine-sdk/pysubmarine/submarine/tracking/client.py 84.09% <0.00%> (-1.99%) :arrow_down:
...ver/api/experimenttemplate/ExperimentTemplate.java
...a/org/apache/submarine/server/SubmarineServer.java
...marine/commons/runtime/api/JobComponentStatus.java
...e/server/utils/response/JsonExclusionStrategy.java
...ubmarine/server/rest/workbench/SysDeptRestApi.java
...abase/model/service/RegisteredModelTagService.java
...k8s/experiment/codelocalizer/GitCodeLocalizer.java
...r/database/experiment/entity/ExperimentEntity.java
...mitter/k8s/model/common/PersistentVolumeClaim.java
... and 244 more

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

cdmikechen commented 1 year ago

@FatalLin Hi~ I would like to know if this fix is already available for building images locally, please?

FatalLin commented 1 year ago

@FatalLin Hi~ I would like to know if this fix is already available for building images locally, please?

Yes, just like I mentioned in description, both the jupyter and jupyter-gpu could be built in my local environment.

FatalLin commented 1 year ago

@FatalLin I think this should be fine, but I just have a question: Can we add a version range limit on the changes like @pingsutw mentioned last time? This would avoid the problem of new versions causing build failures again in the future.

do you mean fix the version of all dependency?

cdmikechen commented 1 year ago

@FatalLin I think this should be fine, but I just have a question: Can we add a version range limit on the changes like @pingsutw mentioned last time? This would avoid the problem of new versions causing build failures again in the future.

do you mean fix the version of all dependency?

I mean the lines you changed with pip install.

FatalLin commented 1 year ago

@FatalLin I think this should be fine, but I just have a question: Can we add a version range limit on the changes like @pingsutw mentioned last time? This would avoid the problem of new versions causing build failures again in the future.

do you mean fix the version of all dependency?

I mean the lines you changed with pip install.

I think we could fix the version at 1.2.2 - which would be the same with the current command, Im not favor in using a range of version on single dependency is because somehow it may bring unexpected changes from their own dependency, just like it require the version of nodeJs is higher than 14.0 this time.

cdmikechen commented 1 year ago

@FatalLin I think this should be fine, but I just have a question: Can we add a version range limit on the changes like @pingsutw mentioned last time? This would avoid the problem of new versions causing build failures again in the future.

do you mean fix the version of all dependency?

I mean the lines you changed with pip install.

I think we could fix the version at 1.2.2 - which would be the same with the current command, Im not favor in using a range of version on single dependency is because somehow it may bring unexpected changes from their own dependency, just like it require the version of nodeJs is higher than 14.0 this time.

Is it possible to determine the range/max of versions of cvxpy and pyqlib or nodejs supported by conda 4.11 in the official jupyter project? Or is it possible that using the conda command will automatically install the best version for the current environment?

If we don't specify the version range, every install command will get the latest version and at some point in the future if there is a conflict between the new versions it will again break the build, which is what I'm worried about.

FatalLin commented 1 year ago

Contributor

@FatalLin I think this should be fine, but I just have a question: Can we add a version range limit on the changes like @pingsutw mentioned last time? This would avoid the problem of new versions causing build failures again in the future.

do you mean fix the version of all dependency?

I mean the lines you changed with pip install.

I think we could fix the version at 1.2.2 - which would be the same with the current command, Im not favor in using a range of version on single dependency is because somehow it may bring unexpected changes from their own dependency, just like it require the version of nodeJs is higher than 14.0 this time.

Is it possible to determine the range/max of versions of cvxpy and pyqlib or nodejs supported by conda 4.11 in the official jupyter project? Or is it possible that using the conda command will automatically install the best version for the current environment?

If we don't specify the version range, every install command will get the latest version and at some point in the future if there is a conflict between the new versions it will again break the build, which is what I'm worried about.

I don't get it, if your purpose is trying to avoid potential conflict, why dont we just fix all the version here?

cdmikechen commented 1 year ago

Contributor

@FatalLin I think this should be fine, but I just have a question: Can we add a version range limit on the changes like @pingsutw mentioned last time? This would avoid the problem of new versions causing build failures again in the future.

do you mean fix the version of all dependency?

I mean the lines you changed with pip install.

I think we could fix the version at 1.2.2 - which would be the same with the current command, Im not favor in using a range of version on single dependency is because somehow it may bring unexpected changes from their own dependency, just like it require the version of nodeJs is higher than 14.0 this time.

Is it possible to determine the range/max of versions of cvxpy and pyqlib or nodejs supported by conda 4.11 in the official jupyter project? Or is it possible that using the conda command will automatically install the best version for the current environment? If we don't specify the version range, every install command will get the latest version and at some point in the future if there is a conflict between the new versions it will again break the build, which is what I'm worried about.

I don't get it, if your purpose is trying to avoid potential conflict, why dont we just fix all the version here?

I'm not a heavy user of conda, but I think conda should install the version of the resource that matches the dependencies according to its own environment (4.11). So I think we should just make sure that the cvxpy and pyqlib version which you changed using the pip install command matches or try using the conda install command to verify it.

FatalLin commented 1 year ago

Contributor

@FatalLin I think this should be fine, but I just have a question: Can we add a version range limit on the changes like @pingsutw mentioned last time? This would avoid the problem of new versions causing build failures again in the future.

do you mean fix the version of all dependency?

I mean the lines you changed with pip install.

I think we could fix the version at 1.2.2 - which would be the same with the current command, Im not favor in using a range of version on single dependency is because somehow it may bring unexpected changes from their own dependency, just like it require the version of nodeJs is higher than 14.0 this time.

Is it possible to determine the range/max of versions of cvxpy and pyqlib or nodejs supported by conda 4.11 in the official jupyter project? Or is it possible that using the conda command will automatically install the best version for the current environment? If we don't specify the version range, every install command will get the latest version and at some point in the future if there is a conflict between the new versions it will again break the build, which is what I'm worried about.

I don't get it, if your purpose is trying to avoid potential conflict, why dont we just fix all the version here?

I'm not a heavy user of conda, but I think conda should install the version of the resource that matches the dependencies according to its own environment (4.11). So I think we should just make sure that the cvxpy and pyqlib version which you changed using the pip install command matches or try using the conda install command to verify it.

As my understanding, the version of the package we install via conda is more associated with the package itself(branch/tag/fix version...etc) and less with the conda's version. Since both the cvxpy and pyqlib is not required by the jupyter, and vise versa, I didn't find any official document to indicate the compatible version between these packages( it would be great appreciated if you point them out). If we want avoid the potential conflict in the future, based on my personal experience, the proper way is fixing all the package in a single version, no matter they are installed by conda or pip.

cdmikechen commented 1 year ago

Here is the original issue link about qlib: https://issues.apache.org/jira/browse/SUBMARINE-725 We can discuss how to determine the version of qlib and the other related dependencies at the next meeting.

FatalLin commented 1 year ago

@pingsutw could you please some suggestion on the version of these packages we installed? Is that still fulfill your origin thought if we fix all the version on which the current stable is? Or its better to keep them to install the version stable branch.