apache / submarine

Submarine is Cloud Native Machine Learning Platform.
https://submarine.apache.org/
Apache License 2.0

SUBMARINE-1363. Git-Sync container can not pull specified branch or main codes #1064

Closed cdmikechen closed 1 year ago

cdmikechen commented 1 year ago

What is this PR for?

Add support in Submarine for pulling Git code with a specified branch, username, and password. Some Python CI issues were discovered while making these changes; they have been fixed as well.
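As a rough illustration of what pulling a specific branch with credentials involves, here is a minimal sketch that assembles a `git clone` command. `GitSource`, `build_clone_command`, and the `/code` destination are hypothetical names used only for this example; they are not the SDK's `git_code_spec` model or the Git-Sync container's actual logic.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import quote, urlsplit, urlunsplit


@dataclass
class GitSource:
    """Hypothetical stand-in for a Git code spec; not the SDK's actual model."""
    url: str
    branch: Optional[str] = None
    username: Optional[str] = None
    password: Optional[str] = None


def build_clone_command(src: GitSource, dest: str) -> List[str]:
    """Return a `git clone` argv for the given source (illustrative only)."""
    url = src.url
    if src.username:
        parts = urlsplit(url)
        # Percent-encode credentials so characters like '@' or ':' stay valid.
        cred = quote(src.username, safe="")
        if src.password:
            cred += ":" + quote(src.password, safe="")
        # Drop any credentials already embedded in the URL.
        host = parts.netloc.rsplit("@", 1)[-1]
        url = urlunsplit(parts._replace(netloc=f"{cred}@{host}"))
    cmd = ["git", "clone", "--single-branch"]
    if src.branch:
        # Without --branch, git checks out the remote's default branch.
        cmd += ["--branch", src.branch]
    return cmd + [url, dest]


# Example: build_clone_command(GitSource(url, branch="master"), "/code")
```

Embedding credentials in the URL keeps the sketch short, but they can show up in process listings and logs, so a credential helper or mounted secret may be preferable in practice.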

What type of PR is it?

Bug Fix

Todos

What is the Jira issue?

https://issues.apache.org/jira/browse/SUBMARINE-1363

How should this be tested?

CI/CD

Screenshots (if appropriate)

image

Questions:

codecov[bot] commented 1 year ago

Codecov Report

Merging #1064 (0632373) into master (21ba037) will increase coverage by 0.01%. The diff coverage is 72.80%.

@@            Coverage Diff             @@
##           master    #1064      +/-   ##
==========================================
+ Coverage   67.25%   67.27%   +0.01%     
==========================================
  Files         127      128       +1     
  Lines        6139     6224      +85     
==========================================
+ Hits         4129     4187      +58     
- Misses       2010     2037      +27     
| Flag | Coverage Δ |
| --- | --- |
| python-integration | 54.20% <72.80%> (+0.19%) :arrow_up: |
| python-unit | 48.00% <44.73%> (-0.07%) :arrow_down: |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
| --- | --- |
| ...submarine/submarine/client/models/git_code_spec.py | 67.46% <67.46%> (ø) |
| ...ne/submarine/client/models/experiment_task_spec.py | 72.26% <77.77%> (ø) |
| ...arine/submarine/client/models/notebook_pod_spec.py | 69.87% <77.77%> (ø) |
| ...arine-sdk/pysubmarine/submarine/client/__init__.py | 100.00% <100.00%> (ø) |
| ...dk/pysubmarine/submarine/client/models/__init__.py | 100.00% <100.00%> (ø) |
| ...k/pysubmarine/submarine/client/models/code_spec.py | 66.07% <100.00%> (ø) |


cdmikechen commented 1 year ago

I have found that most of the current integration tests (e.g. https://github.com/apache/submarine/actions/runs/5345025163/jobs/9690142792?pr=1064) fail with the error "ERROR: Could not install packages due to an OSError: [Errno 28] No space left on device" when installing the TensorFlow dependencies.

image

I ran the df command in a CI step to print the disk usage of the current container and have not found anything abnormal so far:

image
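
For anyone who wants the same check from Python rather than `df`, a minimal sketch using the standard library's `shutil.disk_usage` could look like this. The function name `format_disk_usage` and the default path are assumptions for illustration; this is not part of the CI workflow in this PR.

```python
import shutil


def format_disk_usage(path: str = "/") -> str:
    """Summarize total, used, and free space for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    gib = 1024 ** 3
    percent = 100 * usage.used / usage.total if usage.total else 0.0
    return (
        f"{path}: total={usage.total / gib:.1f}GiB "
        f"used={usage.used / gib:.1f}GiB "
        f"free={usage.free / gib:.1f}GiB ({percent:.0f}% used)"
    )


# Example: print(format_disk_usage("/")) before and after the install step.
```

Printing this before and after the dependency install step would show how much space the install itself consumes.
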

cdmikechen commented 1 year ago

In some CI workflows, the following error also occurs:

image

This error started appearing today, and I suspect it may be related to a network problem when pulling the TensorFlow files.

cdmikechen commented 1 year ago

@pingsutw Hi~ I have fixed the CI-related issues and all checks now pass. Please review this PR when it is convenient for you. To work around the "No space left on device" error when installing pip dependencies, I moved the pip install steps earlier in the workflow and cleared the pip cache after installation. This seems to have eliminated the problem for now.
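
The install-then-clear-cache ordering described above can be sketched roughly as follows. This is an illustration, not the actual workflow change: `pip_install_steps`, `run_steps`, and the `requirements.txt` file name are hypothetical, and `pip cache purge` assumes pip 20.1 or later.

```python
import subprocess
import sys
from typing import Callable, List


def pip_install_steps(requirements: str) -> List[List[str]]:
    """Return the commands to install dependencies, then purge pip's cache."""
    pip = [sys.executable, "-m", "pip"]
    return [
        pip + ["install", "-r", requirements],  # install while disk space is still free
        pip + ["cache", "purge"],               # then drop cached wheels and downloads
    ]


def run_steps(steps: List[List[str]], runner: Callable = subprocess.run) -> None:
    """Run each command in order; check=True makes a failing step abort the run."""
    for argv in steps:
        runner(argv, check=True)


# Example (hypothetical file name): run_steps(pip_install_steps("requirements.txt"))
```

Keeping command construction separate from execution makes the ordering easy to inspect without actually invoking pip.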