flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.47k stars 584 forks

[Housekeeping] Test Flytectl & Flytekit on Windows #1561

Open samhita-alla opened 2 years ago

samhita-alla commented 2 years ago

Describe the Issue: We haven't yet thoroughly tested Flytekit and Flytectl on Windows. We'd love for someone with a Windows machine to test this out.

What if we do not do this? Flyte on Windows may not be foolproof.

Related component: flytekit, flytectl

samhita-alla commented 2 years ago

cc: @evalsocket

samhita-alla commented 2 years ago

@kuspia, please go ahead! Thank you for picking this up.

First, make sure the getting started guide works. Please share with us the screenshots of outputs of all the commands you run. Once you get the Flyte UI up and running, please send us a short video of an example workflow run.

You can join our Slack #hacktoberfest2021 channel and ask us there if you have any questions.

@evalsocket Is there anything specific we want to test on Windows?

martinlyra commented 2 years ago

Hello. Sorry to disrupt or step in, but I wanted to find out too, so I ran the commands from the Getting Started guide on my Windows machine (which is my daily driver). It seems to work at a glance. Should I share my findings here as well?

samhita-alla commented 2 years ago

Hi, @martinlyra! Thanks for running the code. Please do share your findings. :)

martinlyra commented 2 years ago

First I'll leave some specifications of my Windows version and Python installation. I think that can help in figuring out any differences.

Running python .\myapp\workflows\example.py from PowerShell:

PS D:\flyte\flytekit-python-template> python .\myapp\workflows\example.py
{"asctime": "2021-10-06 22:39:31,727", "name": "flytekit", "levelname": "DEBUG", "message": "Task returns unnamed native tuple <class 'str'>"}
{"asctime": "2021-10-06 22:39:31,733", "name": "flytekit", "levelname": "DEBUG", "message": "Task returns unnamed native tuple <class 'str'>"}
{"asctime": "2021-10-06 22:39:31,733", "name": "flytekit", "levelname": "INFO", "message": "Invoking __main__.say_hello with inputs: {}"}
INFO:flytekit:Invoking __main__.say_hello with inputs: {}
{"asctime": "2021-10-06 22:39:31,733", "name": "flytekit", "levelname": "INFO", "message": "Task executed successfully in user level, outputs: hello world"}
INFO:flytekit:Task executed successfully in user level, outputs: hello world
Running my_wf() hello world

Running as module:

PS D:\flyte\flytekit-python-template> python -m myapp.workflows.example
{"asctime": "2021-10-07 20:27:37,303", "name": "flytekit", "levelname": "DEBUG", "message": "Task returns unnamed native tuple <class 'str'>"}
{"asctime": "2021-10-07 20:27:37,303", "name": "flytekit", "levelname": "DEBUG", "message": "Task returns unnamed native tuple <class 'str'>"}
{"asctime": "2021-10-07 20:27:37,329", "name": "flytekit", "levelname": "INFO", "message": "Invoking __main__.say_hello with inputs: {}"}
INFO:flytekit:Invoking __main__.say_hello with inputs: {}
{"asctime": "2021-10-07 20:27:37,329", "name": "flytekit", "levelname": "INFO", "message": "Task executed successfully in user level, outputs: hello world"}
INFO:flytekit:Task executed successfully in user level, outputs: hello world
Running my_wf() hello world

I'll try to do the next two getting-started steps when I get more time. It may take a while because this is a completely new kind of software for me to deal with.

samhita-alla commented 2 years ago

Oh sure! Thank you for doing this. :)

Apologies for not mentioning this before: can you use Python 3.8 to test the code? Flyte should be compatible with that specific Python version. Also, if possible, please share with us the command prompt screenshots.

cc: @kumare3

martinlyra commented 2 years ago

Sure! But why not hit two birds with one rock? ;)

I've installed 3.8.10 as a local installation in a folder next to the flyte project. That avoids downgrading the current system install of Python and lets me test Flyte on both 3.8 and 3.9.

.\python-3.8\python.exe -m pip install flytekit went through without problems.

Running getting started again, there seems to be no obvious difference between 3.8 and 3.9 for the first getting-started example:

PS D:\flyte> .\python-3.8\python.exe .\flytekit-python-template\myapp\workflows\example.py
{"asctime": "2021-10-08 11:46:52,570", "name": "flytekit", "levelname": "DEBUG", "message": "Task returns unnamed native tuple <class 'str'>"}
{"asctime": "2021-10-08 11:46:52,570", "name": "flytekit", "levelname": "DEBUG", "message": "Task returns unnamed native tuple <class 'str'>"}
{"asctime": "2021-10-08 11:46:52,573", "name": "flytekit", "levelname": "INFO", "message": "Invoking __main__.say_hello with inputs: {}"}
INFO:flytekit:Invoking __main__.say_hello with inputs: {}
{"asctime": "2021-10-08 11:46:52,573", "name": "flytekit", "levelname": "INFO", "message": "Task executed successfully in user level, outputs: hello world"}
INFO:flytekit:Task executed successfully in user level, outputs: hello world
Running my_wf() hello world

If this is alright, I'll try to do the next ones with 3.8 like this.
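
The side-by-side check described above (same script, two interpreters) can be sketched as a small helper. This is only an illustration: `run_with_interpreters` is a hypothetical name, and the interpreter paths in the usage comment are taken from this thread as an assumption about the local layout.

```python
import subprocess
import sys


def run_with_interpreters(interpreters, args):
    """Run the same arguments under each interpreter and collect the results.

    Returns a dict mapping interpreter path -> (return code, stripped stdout).
    """
    results = {}
    for exe in interpreters:
        proc = subprocess.run([exe, *args], capture_output=True, text=True)
        results[exe] = (proc.returncode, proc.stdout.strip())
    return results


if __name__ == "__main__":
    # Assumed usage: replace sys.executable with e.g. r".\python-3.8\python.exe"
    # and the system python, and "-c ..." with the path to example.py.
    outcome = run_with_interpreters([sys.executable], ["-c", "print('hello world')"])
    for exe, (code, out) in outcome.items():
        print(exe, code, out)
```

Comparing the return codes and output per interpreter makes any difference between 3.8 and 3.9 easy to spot.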

samhita-alla commented 2 years ago

Awesome! You were able to install everything smoothly, right? You can proceed with the next steps in the getting started guide.

martinlyra commented 2 years ago

Yes! Apart from the warnings about the modules not being installed to a folder on my system PATH, which is intended anyway as far as this 3.8 installation goes.

Either way, moving on to flytectl. As far as I am aware, PowerShell cannot run bash scripts. It does have curl and wget, but only as aliases for a command (I forget which) that requires Internet Explorer to parse the HTTP response. So I used Invoke-RestMethod https://raw.githubusercontent.com/flyteorg/flytectl/master/install.sh > install.sh, which works, but trying to run .\install.sh in PowerShell does not.

So instead I used the Bash shell that was included with the Git installation by default. It runs on MINGW64, which should have bash and enough utilities to run the commands in the 2nd step, with one modification:

matly@DESKTOP-E3BK7I8 MINGW64 /d/flyte
$ curl -s https://raw.githubusercontent.com/flyteorg/flytectl/master/install.sh | bash -s -- -b flytectl

That way it installs into the D:\flyte\flytectl folder instead.

The next commands give:

PS D:\flyte> .\flytectl\flytectl.exe upgrade
time="2021-10-08T20:15:55+02:00" level=info msg="[0] Couldn't find a config file []. Relying on env vars and pflags."
{"json":{},"level":"warning","msg":"Starting an unauthenticated client because: failed to fetch auth metadata. Error: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp: missing address\"","ts":"2021-10-08T20:15:55+02:00"}
{"json":{},"level":"info","msg":"Initialized Admin client","ts":"2021-10-08T20:15:55+02:00"}
You have already latest version of flytectl

...and

PS D:\flyte> .\flytectl\flytectl.exe version
time="2021-10-08T20:16:04+02:00" level=info msg="[0] Couldn't find a config file []. Relying on env vars and pflags."
{"json":{},"level":"warning","msg":"Starting an unauthenticated client because: failed to fetch auth metadata. Error: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp: missing address\"","ts":"2021-10-08T20:16:04+02:00"}
{"json":{},"level":"info","msg":"Initialized Admin client","ts":"2021-10-08T20:16:04+02:00"}

 A new release of flytectl is available: 0.3.9 → v0.3.9

{
  "App": "flytectl",
  "Build": "83e1b61",
  "Version": "0.3.9",
  "BuildTime": "2021-10-08 20:16:03.9992127 +0200 CEST m=+0.018003501"
}

The next step is to get the cluster running, so I need to get both Docker and WSL (as required by Docker for Windows) installed and running. I'll be back when I have more news to share! At this point I would like to turn this into a blog, ha.

martinlyra commented 2 years ago

Alright. I have good and bad news!

Starting with the good: I have managed to get far enough to have the Flyte UI up and working, so I can interact and browse around in the window. It took a little while to get it running, though.

The sandbox commands work as they seem intended to, except that progress logging behaves oddly on Windows: instead of updating the output in place, it prints every update as a new line in the log.

Now, the bad news. My troubles start with pyflyte, which I run directly from Python's Scripts folder because it is not on my PATH. That aside, the real trouble is that the script seems to depend on the Unix-only posix module, which is not available in Python on Windows.

PS D:\flyte\myflyteapp> ..\python-3.8\Scripts\pyflyte.exe
Traceback (most recent call last):
  File "d:\flyte\python-3.8\lib\site-packages\scantree\compat.py", line 20, in <module>
    from posix import DirEntry
ModuleNotFoundError: No module named 'posix'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\flyte\python-3.8\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "d:\flyte\python-3.8\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\flyte\python-3.8\Scripts\pyflyte.exe\__main__.py", line 4, in <module>
  File "d:\flyte\python-3.8\lib\site-packages\flytekit\clis\sdk_in_container\pyflyte.py", line 8, in <module>
    from flytekit.clis.sdk_in_container.fast_register import fast_register
  File "d:\flyte\python-3.8\lib\site-packages\flytekit\clis\sdk_in_container\fast_register.py", line 12, in <module>
    from flytekit.tools.fast_registration import compute_digest as _compute_digest
  File "d:\flyte\python-3.8\lib\site-packages\flytekit\tools\fast_registration.py", line 7, in <module>
    import dirhash as _dirhash
  File "d:\flyte\python-3.8\lib\site-packages\dirhash\__init__.py", line 13, in <module>
    from scantree import (
  File "d:\flyte\python-3.8\lib\site-packages\scantree\__init__.py", line 3, in <module>
    from ._path import (
  File "d:\flyte\python-3.8\lib\site-packages\scantree\_path.py", line 7, in <module>
    from .compat import (
  File "d:\flyte\python-3.8\lib\site-packages\scantree\compat.py", line 22, in <module>
    from scandir import scandir as _scandir
ModuleNotFoundError: No module named 'scandir'

Looking up the documentation for posix, its introduction starts with...

Do not import this module directly. Instead, import the module os, which provides a portable version of this interface. On Unix, the os module provides a superset of the posix interface. On non-Unix operating systems the posix module is not available, but a subset is always available through the os interface. Once os is imported, there is no performance penalty in using it instead of posix. In addition, os provides some additional functionality, such as automatically calling putenv() when an entry in os.environ is changed.

That might do it, ouch. I am willing to volunteer to try to fix this if possible. It is Hacktoberfest after all!
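
For illustration, the portable route that the quoted documentation points to can be sketched as follows. This assumes Python 3.6 or newer, where os.scandir and os.DirEntry are available on every platform. `list_files` is a hypothetical helper, and this is not a description of any actual patch to flytekit, dirhash, or scantree.

```python
import os
import tempfile


def list_files(root):
    """Return the sorted names of regular files directly under root."""
    names = []
    # os.scandir yields os.DirEntry objects on Windows and Unix alike,
    # so nothing has to be imported from the Unix-only posix module.
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_file():
                names.append(entry.name)
    return sorted(names)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("b.txt", "a.txt"):
            with open(os.path.join(tmp, name), "w") as fh:
                fh.write("x")
        os.mkdir(os.path.join(tmp, "subdir"))
        print(list_files(tmp))
```

The same code runs unchanged on Windows because it only goes through the os module.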

yindia commented 2 years ago

@samhita-alla We only need to test getting started first and then we can test other commands.

@martinlyra Awesome! @wild-endeavor can help you with the POSIX issue.

samhita-alla commented 2 years ago

@martinlyra This is uber-cool! We really appreciate the work you've been doing. :)

Regarding POSIX, we'll be more than happy if you could contribute. Let's see what's the possible fix!

martinlyra commented 2 years ago

I just need to know where I can find the right repository and folder, and maybe some know-how to test out the changes!

Would it be in the flytekit repo?

samhita-alla commented 2 years ago

Hi, @martinlyra! Yes, it has to be in the Flytekit repo, most probably, the pyflyte.py file. Can you join our Slack #hacktoberfest2021 channel? We can talk more about it there. There's already a thread I have started to initiate the discussion.

samhita-alla commented 2 years ago

You'd have to install Flytekit in development mode to test your changes as described in the contribution guide. You can then run pyflyte to check if it works as expected.

samhita-alla commented 2 years ago

Hi, @martinlyra! How's it going? Have you hit any roadblocks?

martinlyra commented 2 years ago

I have not hit any roadblocks yet! A few errands popped up that I needed to get done first, and I still have a few more to do, so my progress may be slower this week.

martinlyra commented 2 years ago

A summary for posterity, since much of the discussion took place in Slack: the PR to fix the issue with pyflyte using a package not supported on Windows has been merged.

I've successfully registered the myapp example image with pyflyte and then with flytectl, which also opened the graph of the example workflow in my browser.

I'll continue testing Windows compatibility and functions, though I should mention that I am a full-time student with an exam week coming up. In the meantime, progress will be slower, and I'll help whenever I can.

samhita-alla commented 2 years ago

Thank you for your contribution, @martinlyra! You pretty much tested getting started on Windows. That should resolve this issue. However, I'm going to keep this open as we can test other utilities. Take your time in doing so!

And, can you fill in this form as part of Hacktoberfest?

martinlyra commented 2 years ago

Of course! I am only halfway through the getting started guide! Better to get things done than to leave them unfinished. I've submitted the form, although more may come before the end of October.

rozsasarpi commented 2 years ago

I could run the getting started code up to Executing Workflows on a Flyte Cluster.

Running this command:

pyflyte run --remote example.py wf --n 500 --mean 42 --sigma 2

Led to this error:

E0712 14:32:41.840000000 24152 src/core/tsi/ssl_transport_security.cc:1495] Handshake failed with fatal error SSL_ERROR_SSL: error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER.
E0712 14:32:41.849000000 24152 src/core/tsi/ssl_transport_security.cc:1495] Handshake failed with fatal error SSL_ERROR_SSL: error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER.
{"asctime": "2022-07-12 14:32:41,866", "name": "flytekit.cli", "levelname": "ERROR", "message": "Non-auth RPC error <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses\"\n\tdebug_error_string = \"{\"created\":\"@1657629161.869000000\",\"description\":\"Failed to pick subchannel\",\"file\":\"src/core/ext/filters/client_channel/client_channel.cc\",\"file_line\":3261,\"referenced_errors\":[{\"created\":\"@1657629161.869000000\",\"description\":\"failed to connect to all addresses\",\"file\":\"src/core/lib/transport/error_utils.cc\",\"file_line\":167,\"grpc_status\":14}]}\"\n>, sleeping 200ms and retrying"}
{"asctime": "2022-07-12 14:32:42,085", "name": "flytekit.cli", "levelname": "ERROR", "message": "Non-auth RPC error <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses\"\n\tdebug_error_string = \"{\"created\":\"@1657629162.085000000\",\"description\":\"Failed to pick subchannel\",\"file\":\"src/core/ext/filters/client_channel/client_channel.cc\",\"file_line\":3261,\"referenced_errors\":[{\"created\":\"@1657629162.085000000\",\"description\":\"failed to connect to all addresses\",\"file\":\"src/core/lib/transport/error_utils.cc\",\"file_line\":167,\"grpc_status\":14}]}\"\n>, sleeping 400ms and retrying"}
Traceback (most recent call last):
  File "C:\Users\rozsa\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\rozsa\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\rozsa\working_folder\flyte_test\env\Scripts\pyflyte.exe\__main__.py", line 7, in <module>
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\click\core.py", line 760, in invoke
    return callback(*args, **kwargs)
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\flytekit\clis\sdk_in_container\run.py", line 516, in _run
    remote_entity = remote.register_script(
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\flytekit\remote\remote.py", line 596, in register_script
    upload_location, md5_bytes = fast_register_single_script(
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\flytekit\tools\script_mode.py", line 116, in fast_register_single_script
    upload_location = create_upload_location_fn(content_md5=md5)
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\flytekit\clients\friendly.py", line 998, in get_upload_signed_url
    return super(SynchronousFlyteClient, self).create_upload_location(
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\flytekit\clients\raw.py", line 42, in handler
    return fn(*args, **kwargs)
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\flytekit\clients\raw.py", line 859, in create_upload_location
    return self._dataproxy_stub.CreateUploadLocation(create_upload_location_request, metadata=self._metadata)
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\grpc\_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "C:\Users\rozsa\working_folder\flyte_test\env\lib\site-packages\grpc\_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1657629162.505000000","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3261,"referenced_errors":[{"created":"@1657629162.505000000","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"

Python 3.9, fresh flytekit (v1.1.0) installation in a new virtual environment, Windows 10.

Any thoughts on resolving the issue?

samhita-alla commented 2 years ago

@rozsasarpi, you'll have to export the following two environment variables:

export KUBECONFIG=$KUBECONFIG:/Users/samhitaalla/.kube/config:/Users/samhitaalla/.flyte/k3s/k3s.yaml
export FLYTECTL_CONFIG=/Users/samhitaalla/.flyte/config-sandbox.yaml

Make sure you modify the path to point to the correct flytectl config and k3s YAML file.
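
A minimal sketch of building the same two variables in a platform-independent way. One detail worth knowing on Windows is that entries in KUBECONFIG are separated by a semicolon rather than a colon, which Python exposes as os.pathsep. The helper name `sandbox_env` is hypothetical, and the directory layout under the home folder is assumed to mirror the paths shown above.

```python
import os
from pathlib import Path


def sandbox_env(home):
    """Build KUBECONFIG and FLYTECTL_CONFIG for a sandbox under `home`.

    The layout (.kube/config, .flyte/k3s/k3s.yaml, .flyte/config-sandbox.yaml)
    is an assumption copied from the example paths in this thread.
    """
    flyte_dir = Path(home) / ".flyte"
    kubeconfigs = [
        Path(home) / ".kube" / "config",
        flyte_dir / "k3s" / "k3s.yaml",
    ]
    return {
        # os.pathsep is ";" on Windows and ":" on Linux/macOS.
        "KUBECONFIG": os.pathsep.join(str(p) for p in kubeconfigs),
        "FLYTECTL_CONFIG": str(flyte_dir / "config-sandbox.yaml"),
    }


if __name__ == "__main__":
    env = sandbox_env(Path.home())
    # Affects only this process and any child processes it starts.
    os.environ.update(env)
    for key, value in env.items():
        print(f"{key}={value}")
```

Setting the variables this way does not persist them in the parent shell, so anything that needs them would have to be launched from the same Python process.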

rozsasarpi commented 2 years ago

@samhita-alla thanks, adding the environment variables solved the problem! (I should have read the text printed to the command window...)

I successfully ran all the Getting started code (flytekit and flytectl) on Windows.

The KUBECONFIG environment variable that I ended up using (with your example):

export KUBECONFIG=/Users/samhitaalla/.flyte/k3s/k3s.yaml

github-actions[bot] commented 1 year ago

Hello 👋, This issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 1 year ago

Hello 👋, This issue has been inactive for over 9 months and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 2 months ago

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏