epam / cloud-pipeline

Cloud agnostic genomics analysis, scientific computation and storage platform
https://cloud-pipeline.com
Apache License 2.0

Allow to run Windows-based applications in the Cloud Pipeline #1832

Open sidoruka opened 3 years ago

sidoruka commented 3 years ago

Background

At the moment, Cloud Pipeline is a fully Linux-based solution, but there is a growing demand for running Windows-based apps as well (especially for the Modelling and Simulation and PK/PD use cases). There was an attempt (#755) to implement this via Windows containers, but it failed because only "headless" applications can be run this way, while the majority of the relevant apps are GUI-based. Let's get back to this task and implement it in a different way. The final goal is the same: "Overall, the look and feel of Windows-based apps shall be the same as for the Linux (to the best possible extent)"

Approach

Action Items

tcibinan commented 3 years ago

The following issues have to be resolved in order to provide a complete Windows experience in Cloud Pipeline.

@Wedds

@Wedds Could you please implement the following tasks?

  1. [x] Support tools scanning

    Currently, Windows tools scanning fails due to a NullPointerException in the API. It seems that Clair itself can scan those images, so the fix should be pretty simple.

  2. [x] Install cloud data desktop application

    The Cloud Data desktop application should be installed on the Windows node during launch.ps1 execution. All data storages available to the run owner should be accessible in the Cloud Data desktop application.

    Ask @rodichenko about the details on how to install and configure Cloud Data desktop application.

  3. [x] Disable Windows node reassignment

    The autoscaler should not reassign Windows nodes. It can identify a Windows workstation by the corresponding node label.

  4. [x] Disable hot node pools of Windows nodes

    Hot node pools of Windows nodes should not be allowed.

  5. [x] Support data storage mounts as disks

    NFS data storages should be mounted to the Windows node during launch.ps1 execution. All NFS data storages available to the run owner should be exposed as disks accessible from Windows Explorer.

  6. [x] Terminate Windows node once its run finishes

  7. [x] Allow to run unscanned tool versions

  8. [x] Use Windows tool platform in scan results

  9. [x] Resolve kubelet node labels race condition

  10. [x] Investigate different home directory name resolution

  11. [x] Prepare Windows support pull requests

  12. [x] Review Windows support pull requests non-API changes

  13. [x] Remove extra shortcuts from desktop

  14. [x] Remove instance details from desktop

  15. [x] Remove extra items from taskbar

  16. [x] Disable Windows tools scanning

  17. [x] Replace OWNER_GROUPS run parameter with a contextual preference which defaults to Users

  18. [x] Merge Windows support pull requests to develop
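
Items 3 and 4 above both depend on telling Windows nodes apart by a node label. A minimal sketch of such a check follows. It assumes the well-known Kubernetes label `kubernetes.io/os` is the one used (the issue does not name the actual label), and `is_windows_node` and `can_reassign` are hypothetical helpers rather than Cloud Pipeline's autoscaler API:

```python
# Assumption: the node platform is exposed via the well-known Kubernetes
# label "kubernetes.io/os"; Cloud Pipeline may rely on a custom label instead.
WINDOWS_OS_LABEL = 'kubernetes.io/os'


def is_windows_node(labels):
    """Return True if the given node labels mark the node as a Windows one."""
    return (labels or {}).get(WINDOWS_OS_LABEL, '').lower() == 'windows'


def can_reassign(labels):
    """Hypothetical autoscaler guard: Windows nodes are never reassigned."""
    return not is_windows_node(labels)
```

The same predicate could also be used to reject Windows nodes when hot node pools are configured.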

@rodichenko

@rodichenko Could you please implement the following tasks?

  1. [x] Disable commit for Windows runs

    The Commit button on the run page should be hidden if the run's platform equals windows.

  2. [x] Support tools instance images setting

    Add an additional optional field to tool settings page called instance image.

    See https://github.com/epam/cloud-pipeline/pull/1518.

  3. [x] Remove add to white list and scan buttons from tool versions page for Windows tools

  4. [x] Show tools platform as Linux/Windows icon

  5. [x] Remove limit mounts sections from tool settings and launch pages for Windows tools

  6. [x] Remove allow sensitive storages checkbox from tool settings page for Windows tools

  7. [x] Remove configure cluster button from tool settings and launch pages for Windows tools

  8. [x] Add distinct run capabilities for Linux and Windows (empty for now) tools

  9. [x] Remove vulnerabilities report and packages tabs from tool version details page for Windows tools

  10. [x] Do not allow Windows tools in hot node tools

  11. [x] Remove browse button from run page for Windows runs

  12. [x] Remove monitor tab from node page for Windows nodes

  13. [x] Add missing general info to node page for Windows nodes

  14. [x] Remove Windows scan description from tool version page for Windows tools
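
The run page rules in items 1 and 11 above (no commit and no browse for Windows runs) can be expressed as a small platform-based filter. This is only an illustrative sketch: the action names and the `visible_run_actions` helper are hypothetical and do not reflect the actual GUI code.

```python
# Hypothetical action names; only the Windows restrictions (commit, browse)
# are taken from the task list above.
RUN_PAGE_ACTIONS = ('commit', 'browse', 'ssh')
HIDDEN_FOR_WINDOWS = frozenset({'commit', 'browse'})


def visible_run_actions(platform):
    """Return the run page actions that should be shown for the platform."""
    if (platform or '').lower() != 'windows':
        return list(RUN_PAGE_ACTIONS)
    return [action for action in RUN_PAGE_ACTIONS
            if action not in HIDDEN_FOR_WINDOWS]
```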

@mzueva

@mzueva Could you please implement the following tasks?

  1. [x] Review Windows support pull requests API changes

@tcibinan

Tasks for myself.

  1. [x] Reactivate windows license after cluster joining
  2. [x] Get rid of Kubernetes config usage in init_multicloud.ps1
  3. [x] Apply similar changes to all cloud providers nodeup.py scripts
  4. [x] Add licenses to all new scripts
  5. [x] Resolve checkstyle and pmd issues
  6. [x] Move pod launch commands to system preferences
  7. [x] Replace launch.ps1 with launch.py
  8. [x] Install Google Chrome in ami
  9. [x] Support proxying in pipe tunnel
  10. [x] Support ssh to Windows nodes using wetty
  11. [x] Support auto login as Cloud Pipeline user
  12. [x] Support both admin and non admin user
  13. [x] Allow access to kube config to admin users only
  14. [x] Download all resources in init_multicloud.ps1 from open source data storage
  15. [x] Resolve tool platform from tool latest version on the fly
  16. [x] Prepare Windows support pull requests
  17. [x] Automatically resolve network adapter name in node init scripts
  18. [x] Review Windows support pull requests non-API changes
  19. [x] Initialize Windows node non root disks
  20. [x] Update Windows account password requirements

Pull requests

  1. [x] #1953
  2. [x] #1967
  3. [x] #1968
  4. [x] #1969
  5. [x] #2002
  6. [x] #2003
  7. [x] #2006
  8. [x] #2007
  9. [x] #2009
  10. [x] #2010
  11. [ ] #2030
  12. [ ] #2036

  13. [x] #2017

  14. [x] #2029

Backlog

  1. [ ] Adapt NoMachine resolution to user's workstation resolution
  2. [ ] Investigate possible AD integration

  3. [ ] Increase kubelet image-pull-progress-deadline duration to some bigger value
  4. [ ] Embed Windows tool into Windows ami
  5. [ ] Upload Windows python packages to Cloud Pipeline pypi repository
  6. [ ] Persist environment to powershell and batch profiles
  7. [ ] Support cp-node-logger on Windows nodes
  8. [ ] Support oom-reporter on Windows nodes
  9. [ ] Support pause of Windows nodes
  10. [ ] Automate Windows tools building and publishing
  11. [ ] Extract cidr to nodeup.py parameter
  12. [ ] Reduce Windows docker image size
  13. [ ] Investigate and reduce node/pod startup times
  14. [ ] Create windows base ami with gpu support
  15. [ ] Add Windows ami configuration to cluster.networks.config by default
  16. [ ] Configure user default administrative permissions
  17. [ ] Move launch templates to deployment
  18. [ ] Extract Windows platform check in the server operations to utils or PipelineRun class

Technical debt

  1. [ ] Replace double quotes with single quotes in most places.
  2. [ ] Use os.getenv rather than os.environ.get.
  3. [ ] Use multiline format strings in launch.py.
  4. [ ] Extract current platform checks to pipe common utils.
  5. [x] Use function foo($bar) {} rather than foo {param($bar)} in powershell.
  6. [ ] Do not use python 3 features in pipe common scripts.
  7. [x] Handle empty host_and_port string in _parse_host_and_port in launch.py.
  8. [x] Move declaration of edge_root_cert_path, edge_host and edge_port to the beginning of launch.py.
  9. [x] Pass DavURL as parameter to MountDrive.ps1 script.
  10. [x] Use os.path.join(root, path1, path2) rather than os.path.join(root, f'{path1}\{path2}').
  11. [x] Make drive mapping notification optional depending on some run parameter.
  12. [x] Merge configure_cloud_data_win.py and schedule_cloud_data_configuration_finalization_win.py scripts.
  13. [x] Merge add_root_cert_to_trusted_root_win.py, configure_drive_mount_env_win.py and schedule_drive_mapping_win.py scripts.
  14. [x] Move workflows/pipe-common/config/default_layout_win.xml to Cloud Pipeline static resources.
  15. [x] Extract placeholders management from scheduler.py to the call site.
  16. [x] Move C:\Users\Public\Desktop\NoMachine.lnk deletion to configure_default_desktop_win.py
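
Items 2, 7 and 10 of the technical debt list can be illustrated together. The following is a sketch under stated assumptions, not the actual launch.py code: `parse_host_and_port` is a hypothetical stand-in for `_parse_host_and_port`, the `(None, None)` result for empty input is an assumed convention, and `EDGE_HOST_AND_PORT` is a made-up variable name. IPv6 literals are not handled.

```python
import ntpath
import os


def parse_host_and_port(host_and_port, default_port=None):
    """Split 'host:port' into (host, port).

    Assumed convention: an empty or blank string yields (None, None)
    instead of raising an error.
    """
    if not host_and_port or not host_and_port.strip():
        return None, None
    host, _, port = host_and_port.strip().partition(':')
    return host, int(port) if port else default_port


# os.getenv(name, default) reads an environment variable just like
# os.environ.get(name, default); the variable name here is hypothetical.
edge_host, edge_port = parse_host_and_port(os.getenv('EDGE_HOST_AND_PORT', ''))

# Pass every path part separately instead of embedding a separator in a
# format string. ntpath is used so the example builds Windows paths on any OS;
# on a Windows node os.path.join behaves the same way.
public_dir = ntpath.join('C:\\', 'Users', 'Public')
```
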

rodichenko commented 3 years ago

@tcibinan GUI tasks are implemented (6f6344729fabcb103ca3d463ce98ef7ec314d210, branch gui-windows-based-tools-1832)

rodichenko commented 3 years ago

@mzueva @sidoruka @tcibinan GUI part merged (#2017)

maryvictol commented 3 years ago

Warning message "A large number of the object data storages are going to be mounted for this job" shouldn't be shown at launch of Windows tool.

maryvictol commented 3 years ago

Pause and Versioned Storages links don't work now and should be removed.

maryvictol commented 3 years ago

On Windows machine:

  1. [ ] question 1. Admins cannot access EFS storages via WebDAV without explicit storage permissions. Is this the correct behaviour?
  2. [ ] question 2. If the user does not own any storage and has no permissions on any storage, the message Authentication succeeded for Cloud Storage Drive mapping/ Please follow instructions to configure.... is shown several times when the Windows machine is opened. The final message is StorageMapping [ERROR] WebDav mapping: Drive mapping failed!. Is this the correct message, or should it state the reason for the error?

Issues:

  1. [ ] issue 1. Cloud Data app: the Create directory button should be disabled at the storages level.
  2. [ ] issue 2. Cloud Data app: the Copy, Move and Delete buttons should be disabled for a storage. The Copy and Move buttons should be disabled for files/folders in the local directory if Cloud Data Root is opened on the right side.
  3. [ ] issue 3. Windows Explorer: the Delete option shouldn't be applicable to a storage in Windows Explorer. Currently, after confirming permanent deletion of this folder, all storage entries are deleted.
  4. [ ] issue 4. A .DAV folder with temporary service .pag and .dir files is created in the storage when files/folders are added to the storage via Windows Explorer. It should be hidden and removed from the storage.
  5. [ ] issue 5. "Storage" should be changed to "Storages" in the info message "Storage available for 'USER' are mounted into Z:\" on the "StorageMapping" popup.
  6. [ ] issue 6. An attempt to save a .docx file opened and edited in WordPad directly from the storage gives the error "Failed to save document".

maryvictol commented 3 years ago

Big files copy research:

Copy file via CloudData:

| Direction | 976.59 MB | 1.5 GB | 4.7 GB |
| --- | --- | --- | --- |
| Windows --> Storage | File is copied; file appears in the storage after reload only | File is copied; file appears in the storage after 2-3 min reload | Progress bar displays that the file is copied; after that all info except the modal name disappears from the CloudData modal; file doesn't appear in the storage |
| Windows --> FS Storage | File is copied; file appears in the storage after reload only | File is copied; file appears in the storage after reload only | File is copied; file appears in the storage after reload only |
| Storage --> Windows | The file is copied in about 7 min | The file is copied in about 8 min | The file is copied in about 35 min |
| FS Storage --> Windows | The file is copied in about 12 min | The file is copied in about 12 min | The file is copied in about 42 min |

Copy file via WindowsExplorer:

| Direction | 976.59 MB | 1.5 GB | 4.7 GB |
| --- | --- | --- | --- |
| Windows --> Storage | File is copied; the process hangs for about 1 min at 99% | File is copied; the process hangs for about 1 min at 99% | The process reaches 99% in 1 min and hangs for 2 min; after that the Replace or Skip Files modal appears with info that The destination already has a file named " ... "; a copied file with 0 size appears in the storage |
| Windows --> FS Storage | File is copied; the process hangs for about 1 min at 99% | File is copied; the process hangs for about 1 min at 99% | The process reaches 99% in 1 min and hangs for 2 min; after that the Replace or Skip Files modal appears with info that The destination already has a file named " ... "; clicking the Replace file option gives the error message The file exceeds the limit allowed and cannot be saved.; a copied file with 0 size appears in the storage |
| Storage --> Windows | The file is copied in about 1 min | The file is copied in about 1 min | Error message The file exceeds the limit allowed and cannot be saved.; the file isn't copied |
| FS Storage --> Windows | The file is copied in about 1 min | The file is copied in about 1 min | Error message The file exceeds the limit allowed and cannot be saved.; the file isn't copied |

rodichenko commented 3 years ago

> Pause and Versioned Storages links don't work now and should be removed.

@maryvictol should be fixed with ccf7b3e3685f11bf35580a3ea0f727b3b261682f

rodichenko commented 3 years ago

> Warning message "A large number of the object data storages are going to be mounted for this job" shouldn't be shown at launch of Windows tool.

@maryvictol warning message was removed (f24a58360d2ff5be66abb7467c93f8f076837345)

maryvictol commented 3 years ago

> > Pause and Versioned Storages links don't work now and should be removed.
>
> @maryvictol should be fixed with ccf7b3e

Verified as fixed

maryvictol commented 3 years ago

> > Warning message "A large number of the object data storages are going to be mounted for this job" shouldn't be shown at launch of Windows tool.
>
> @maryvictol warning message was removed (f24a583)

Verified as fixed