cloudstateio / python-support

User Language Support for Python
Apache License 2.0
7 stars 5 forks source link

Implement stateless function support #42

Closed GratefulTony closed 4 years ago

GratefulTony commented 4 years ago

This implements stateless function support. A provisional stateless function TCK is included since to my knowledge, the cloudstate tck does not cover stateless functions yet. By default, executing ./extended_tck.sh will build a docker container, install cloudstate python-support into the docker container (including protobuf build), and run the original tck as well as the extended tck against said container.

Some changes to the build have been made to simplify installation, incorporating protobuf compilation into setup.py.

lightbend-cla-validator commented 4 years ago

Hi @GratefulTony,

Thank you for your contribution! We really value the time you've taken to put this together.

Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement:

https://www.lightbend.com/contribute/cla

lightbend-cla-validator commented 4 years ago

Hi @GratefulTony,

Thank you for your contribution! We really value the time you've taken to put this together.

Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement:

https://www.lightbend.com/contribute/cla

marcellanz commented 4 years ago

@GratefulTony thanks a lot for this contribution :) While reviewing it, we also like to get stateless function support in general in line with the current reference implementation. There are some open questions he project is on for this.

GratefulTony commented 4 years ago

Thanks for the review! I will look into making the requested changes this weekend. Also open to updating things in the future if nomenclature/best practices around stateless functions change.

pvlugter commented 4 years ago

Current plan is to rename stateless functions to Actions. They can certainly be renamed here later.

sleipnir commented 4 years ago

@pvlugter Is that the only change you intend to make?

lightbend-cla-validator commented 4 years ago

At least one pull request committer is not linked to a user. See https://help.github.com/en/articles/why-are-my-commits-linked-to-the-wrong-user#commits-are-not-linked-to-any-user

lightbend-cla-validator commented 4 years ago

At least one pull request committer is not linked to a user. See https://help.github.com/en/articles/why-are-my-commits-linked-to-the-wrong-user#commits-are-not-linked-to-any-user

GratefulTony commented 4 years ago

still have some work to do implementing the requested changes. Will get to it in the coming days. Moving to running unit tests for stateless functions. (can look into renaming to actions) fixed up the tck script to use unique container names, and made the networking more platform agnostic.

lightbend-cla-validator commented 4 years ago

At least one pull request committer is not linked to a user. See https://help.github.com/en/articles/why-are-my-commits-linked-to-the-wrong-user#commits-are-not-linked-to-any-user

lightbend-cla-validator commented 4 years ago

At least one pull request committer is not linked to a user. See https://help.github.com/en/articles/why-are-my-commits-linked-to-the-wrong-user#commits-are-not-linked-to-any-user

marcellanz commented 4 years ago

Hi @chomnoue,

Thank you for your contribution! We really value the time you've taken to put this together.

Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement:

https://www.lightbend.com/contribute/cla

@GratefulTony I see now the typesafe-cla-validator mentioning "chomnoue" for commits you probably pushed. Also there are quite few unrelated commits I think visible in this branch. Could you rebase your changes/additions somehow cleanly to your branch? Regarding the CLA Validator if I remember correctly you got approved its check with your initial commit for this PR, or is this still pending? Thanks :)

marcellanz commented 4 years ago

Hi @chomnoue,

Thank you for your contribution! We really value the time you've taken to put this together.

Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement:

https://www.lightbend.com/contribute/cla

@pvlugter can you imagine why the CLA Validator mentions chomnoue now in this PR? Thanks.

GratefulTony commented 4 years ago

Hi @chomnoue, Thank you for your contribution! We really value the time you've taken to put this together. Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement: https://www.lightbend.com/contribute/cla

@GratefulTony I see now the typesafe-cla-validator mentioning "chomnoue" for commits you probably pushed. Also there are quite few unrelated commits I think visible in this branch. Could you rebase your changes/additions somehow cleanly to your branch? Regarding the CLA Validator if I remember correctly you got approved its check with your initial commit for this PR, or is this still pending? Thanks :)

Yeah. I got approved, but had to go back and rewrite some of my earlier commits since I had my email misconfigured in git config on a new dev machine. I'm in the middle of a rebase to try to clean things up. this is my best guess as to why chomnoue is getting flagged.

marcellanz commented 4 years ago

We discussed the renaming from stateful function to action today at the contributors call and we agreed to: "Python stateless function support can rename to action now or after merge".

So I think after the cleanup and review of this PR we should merge this PR and get the name changed afterwards.

sleipnir commented 4 years ago

@marcellanz I agree with merge first and rename later

GratefulTony commented 4 years ago

I think I have a good rebase now, CLA validator appears to be happy for now. still a tiny bit of work to do to get the docker/proxy-aware unit tests working.

marcellanz commented 4 years ago

I think I have a good rebase now, CLA validator appears to be happy for now. still a tiny bit of work to do to get the docker/proxy-aware unit tests working.

Thanks @GratefulTony sounds great. With about 4k lines removed and 1.5k added, can you please summarize shortly what has added, changed (and removed also) in this PR or what you were looking for. I assume that the PRs title "Implement stateless function support" describes not all changes for what I can see.

sleipnir commented 4 years ago

I think I have a good rebase now, CLA validator appears to be happy for now. still a tiny bit of work to do to get the docker/proxy-aware unit tests working.

Thanks @GratefulTony sounds great. With about 4k lines removed and 1.5k added, can you please summarize shortly what has added, changed (and removed also) in this PR or what you were looking for. I assume that the PRs title "Implement stateless function support" describes not all changes for what I can see.

When I reviewed this PR I saw that it included many formatting and style improvements. But it would be interesting to hear that summary.

GratefulTony commented 4 years ago

yes. the other changes are mainly to do with improving the build. I changed the build to automatically pull and compile the .proto files, I pull the proto files from scm during the setuptools build. I also removed the generated files from the git repo, as they are now generated when the user pip installs. This is the source of most of the removed code. Obviously, the tck script has been changed to run the extended tests. I also ran black, flake8 and isort on the library as there were some minor formatting and style issues.

I also refactored the test script packaging to be able to include the shoppingcart and functiondemo code in unit tests, and implemented some tests (more for functiondemo).

It looks like a new tck image was released last night, and now in the tck script the proxy isn't able to connect to the tck. I also still need to finally determine how to get discovery to work for the stateless functions without a persistence id. (it works fine with one)

marcellanz commented 4 years ago

yes. the other changes are mainly to do with improving the build. I changed the build to automatically pull and compile the .proto files, I pull the proto files from scm during the setuptools build. I also removed the generated files from the git repo, as they are now generated when the user pip installs. This is the source of most of the removed code. Obviously, the tck script has been changed to run the extended tests. I also ran black, flake8 and isort on the library as there were some minor formatting and style issues.

I also refactored the test script packaging to be able to include the shoppingcart and functiondemo code in unit tests, and implemented some tests (more for functiondemo).

@GratefulTony thanks for the summary.

It looks like a new tck image was released last night, and now in the tck script the proxy isn't able to connect to the tck. I also still need to finally determine how to get discovery to work for the stateless functions without a persistence id. (it works fine with one)

yeah, it failed for all other language support libraries: https://observablehq.com/@marcellanz/cloudstate-go-tck-verification

regarding discovery, if the protocol does not define this, it should be noted here on the Draft Spec PR: https://github.com/cloudstateio/cloudstate/pull/119

and we should discuss why no persistence ID should be used I think.

sleipnir commented 4 years ago

Without a doubt we need to formalize at Spec. But I believe that stateless functions do not need a persistence id since nothing persists

marcellanz commented 4 years ago

persistence

CRDTs are also not persisted (yet). And the persistence_id is defined as:

    // The ID to namespace state by. How this is used depends on the type of entity, for example,
    // event sourced entities will prefix this to the persistence id.
    string persistence_id = 3;

But I only mentioned the Spec, because its the container to collect such feedback I think.

sleipnir commented 4 years ago

Yes. I think that in the case of CRDT's there is an expectation that they will be persistent one day. I don't see that happening to the Stateless case, I don't see much sense. But I asked everyone's opinion at Spec

GratefulTony commented 4 years ago

regarding why the persistence_id was included in the implementation: as I had said, it appears that the service discovery mechanism is using an EntitySpec protocol buffer to participate in service description. The EntitySpec has the entities field (of type Entity) which requires the persistence_id field. In my implementation, I do not use the persistence_id to do service resolution, so as far as I can tell, it is a kind of baggage, but a field which I think needs to be populated on the Entity protocol buffer. I don't see any corresponding analogue for stateless functions in the protobuf specs. I wasn't sure if this was used in any way by cloudstate internals, but I set it to the service name to relate the user function to the namespace concept. In my implementation, I believe it does not actually need to be set to non-default, so I could potentially set this field in Entity protocol buffers to empty string, but this didn't seem to be the right thing to do. As far as I can tell, service discovery can't occur if I don't fill in one of these Entity objects for a stateless function definition.

marcellanz commented 4 years ago

It seems the proxy doesn't care ATM: https://github.com/cloudstateio/cloudstate/blob/v0.5.1/proxy/core/src/main/scala/io/cloudstate/proxy/function/StatelessFunctionSupportFactory.scala#L33-L41 just logs the entity.persistenceId and uses solely the entity.serviceName to instantiate the StatelessFunctionSupport. The spec clearly should state its requirements regarding the stateless function support.

sleipnir commented 4 years ago

There is clearly a gap between what was defined in the protobuf and what the proxy implements. But I think that to continue with this PR we can maintain the approach used by @GratefulTony just referring to the persistence id in the discovery stage

marcellanz commented 4 years ago

@sleipnir I Agree.

marcellanz commented 4 years ago

@sean-walsh would you like to have an eye on this community PR for the Python support? Thanks.

GratefulTony commented 4 years ago

So it seems we're ok with the "ride along" persistence_id and renaming to Actions later. I think other than that, all of the requested changes have been implemented. Let me know if there are other changes requested, otherwise, pr should be about ready to merge.

sleipnir commented 4 years ago

Thank you @GratefulTony I will take a look later. Again welcome to our community

marcellanz commented 4 years ago

I tested the PR locally (on a Mac) and ./extended_tck.sh with cloudstate-function` failed as follows:

Run starting. Expected test count is: 5
ConfiguredCloudStateTCK:
Cloudstate TCK 
- must verify that the user function process responds
- must verify that an initial GetShoppingCart request succeeds
- must verify that items can be added to, and removed from, a shopping cart
- must verify that the backend supports the ServerReflection API
- must verify that the HTTP API of ShoppingCart protocol works
[ERROR] [09/23/2020 10:02:03.825] [CloudStateTCK-akka.actor.default-dispatcher-7] [akka://CloudStateTCK/system/pool-master] connection pool for Pool(shared->http://cloudstate-proxy-3hii9jedwx:9000) has shut down unexpectedly
java.lang.IllegalStateException: Pool shutdown unexpectedly
        at akka.http.impl.engine.client.PoolInterface$Logic.postStop(PoolInterface.scala:214)
        at akka.stream.impl.fusing.GraphInterpreter.finalizeStage(GraphInterpreter.scala:599)
        at akka.stream.impl.fusing.GraphInterpreter.finish(GraphInterpreter.scala:324)
        at akka.stream.impl.fusing.GraphInterpreterShell.tryAbort(ActorGraphInterpreter.scala:664)
        at akka.stream.impl.fusing.ActorGraphInterpreter.$anonfun$postStop$1(ActorGraphInterpreter.scala:802)
        at akka.stream.impl.fusing.ActorGraphInterpreter.$anonfun$postStop$1$adapted(ActorGraphInterpreter.scala:802)
        at scala.collection.immutable.Set$Set2.foreach(Set.scala:201)
        at akka.stream.impl.fusing.ActorGraphInterpreter.postStop(ActorGraphInterpreter.scala:802)
        at akka.actor.Actor.aroundPostStop(Actor.scala:554)
        at akka.actor.Actor.aroundPostStop$(Actor.scala:554)
        at akka.stream.impl.fusing.ActorGraphInterpreter.aroundPostStop(ActorGraphInterpreter.scala:691)
        at akka.actor.dungeon.FaultHandling.finishTerminate(FaultHandling.scala:240)
        at akka.actor.dungeon.FaultHandling.terminate(FaultHandling.scala:197)
        at akka.actor.dungeon.FaultHandling.terminate$(FaultHandling.scala:167)
        at akka.actor.ActorCell.terminate(ActorCell.scala:410)
        at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:519)
        at akka.actor.ActorCell.systemInvoke(ActorCell.scala:535)
        at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:295)
        at akka.dispatch.Mailbox.run(Mailbox.scala:230)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)

Run completed in 8 seconds, 486 milliseconds.
Total number of tests run: 5
Suites: completed 1, aborted 0
Tests: succeeded 5, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
Removing docker containers
cloudstate-proxy-3hII9jEdWX
cloudstate-function-3hII9jEdWX
69f64237cde10d2dd02b37536a3ca0c0489eee12bdf1ab9e6c31df2c35d94925
5406ffa2baeb6804187a996ce37bde0e6b5c306563a6e0669cc17c8dd2059c53
2020-09-23 10:02:26,143 - test_shoppingcart.py - INFO: host: cloudstate-proxy-3hII9jEdWX
2020-09-23 10:02:26,143 - test_shoppingcart.py - INFO: port: 9000
2020-09-23 10:02:26,143 - test_shoppingcart.py - INFO: connecting on cloudstate-proxy-3hII9jEdWX:9000
2020-09-23 10:02:26,423 - test_shoppingcart.py - INFO: resp: 
2020-09-23 10:02:26,476 - test_shoppingcart.py - INFO: resp: items {
  product_id: "0"
  name: "beer"
  quantity: 24
}

2020-09-23 10:02:26,477 - test_functiondemo.py - INFO: connecting on cloudstate-proxy-3hII9jEdWX:9000
2020-09-23 10:02:26,477 - test_functiondemo.py - INFO: channel established.
2020-09-23 10:02:26,527 - test_functiondemo.py - INFO: resp: bar: "foo"

2020-09-23 10:02:26,542 - test_functiondemo.py - INFO: resp: bar: "foo!"

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.8/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.8/site-packages/cloudstate/test/tck_services.py", line 33, in <module>
    evaluate_functiondemo_server(server, server_port)
  File "/usr/local/lib/python3.8/site-packages/cloudstate/test/functiondemo/test_functiondemo.py", line 44, in evaluate_functiondemo_server
    stub2.ReverseString2(request_boom2)
  File "/usr/local/lib/python3.8/site-packages/_pytest/python_api.py", line 744, in __exit__
    fail(self.message)
  File "/usr/local/lib/python3.8/site-packages/_pytest/outcomes.py", line 156, in fail
    raise Failed(msg=msg, pytrace=pytrace)
Failed: DID NOT RAISE <class 'Exception'>
cloudstate-proxy-3hII9jEdWX
cloudstate-function-3hII9jEdWX
tck-network-3hII9jEdWX
GratefulTony commented 4 years ago

thanks @marcellanz The changes to the script should be easy. Regarding the TCK, I noticed things weren't working as well when a new version of the TCK container was pushed a few days ago. I will try to get to the bottom of what's going on with the TCK.

marcellanz commented 4 years ago

thanks @marcellanz The changes to the script should be easy. Regarding the TCK, I noticed things weren't working as well when a new version of the TCK container was pushed a few days ago. I will try to get to the bottom of what's going on with the TCK.

@GratefulTony It would be interesting from where the Pool shutdown unexpectedly is coming; I've seen this error on other occasions with other Support Libraries (might be a proxy issue). But as the shopping cart tests succeeded: Tests: succeeded 5, failed 0, canceled 0, ignored 0, pending 0

I made my comment more about this one: Failed: DID NOT RAISE <class 'Exception'>

which seems to come from the new stateless function tests. Thanks for looking into :)

GratefulTony commented 4 years ago

@marcellanz I see. Thanks for pointing that out. Interesting that you don't appear to be using the latest tck image, which iirc, runs 23 tests? I may also need to check that the tck script pulls the latest image version...

It also seemed that there might be an issue with the latest tck... is this why you were using the previous version?

marcellanz commented 4 years ago

@marcellanz I see. Thanks for pointing that out. Interesting that you don't appear to be using the latest tck image, which iirc, runs 23 tests? I may also need to check that the tck script pulls the latest image version...

great point! I think I have still an old image on my box.

sleipnir commented 4 years ago

I think we can only consider the oldest tck to validate this PR.

marcellanz commented 4 years ago

I think we can only consider the oldest tck to validate this PR.

@sleipnir Agree. The last few days too much has changed on cloudstate master. We could use cloudstateio/cloudstate-tck:0.5.1.

@GratefulTony the "old" tck script had also the options to specify alternative docker images used: https://github.com/cloudstateio/python-support/blob/v0.1.1/tck/run_tck.sh#L12

GratefulTony commented 4 years ago

Requested changes made to TCK shell script. Also pinned TCK image version as suggested. TCK passes for me on mac and linux.

I tested on two separate macs with clean clones of the repo and wasn't able to reproduce the failed test case.

sleipnir commented 4 years ago

I ran on my linux machine and saw no problems. I approve this PR

marcellanz commented 4 years ago

@GratefulTony thanks again for your contribution and efforts on this PR :)

GratefulTony commented 4 years ago

Thanks everyone. Excited for the future of this project.

pvlugter commented 4 years ago

🎉 👍