Closed asavaritayal closed 4 years ago
+1 Absolutely needed. Working on a small sample with asyncio to simulate something similar.
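A minimal asyncio simulation of the function-chaining pattern, along the lines of that sample, might look like this (the step names `extract`/`transform`/`load` are purely illustrative, not a real Functions API):

```python
import asyncio

async def extract():
    await asyncio.sleep(0)           # stand-in for real I/O
    return [1, 2, 3]

async def transform(rows):
    await asyncio.sleep(0)
    return [r * 2 for r in rows]

async def load(rows):
    await asyncio.sleep(0)
    return sum(rows)

async def orchestrator():
    # Each step awaits the previous one's output, mimicking durable chaining.
    rows = await extract()
    rows = await transform(rows)
    return await load(rows)

result = asyncio.run(orchestrator())
```

Of course this only simulates the control flow; it has none of the checkpointing/replay durability the real framework would provide.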
+1
+1 Yes, needed! Python has a wide choice of data science libraries, and it definitely has long-running & stateful scenarios!
I am experimenting with triggering an Azure Batch job from an Azure Function. So this is completely asynchronous, and you can also scale up the compute resources easily in Azure Batch.
+1 Absolutely
+1 for long running analytics/ML jobs
+1
This is absolutely a needed feature. I do have a customer who uses Python in Azure Functions for manipulating CSVs & creating time series with Pandas and they are an ideal case for function chaining in DF.
+1 Happy to contribute
/cc @cgillum @kashimiz as FYI
+1 Need it asap :)
+1
Currently running large data transformations with Python. Execution time is around 30 min. Having Durable Functions would help a lot!
I'm saddened to see that Python support for Durable Functions isn't available yet. I've chronicled my efforts to make my usage of the HTTPTrigger more effective/efficient for my workflow in the comments of issue #236. The documentation, however, keeps suggesting that I take the Durable Functions route for a more reasonable experience.
Unfortunately, it looks as though I'll need to re-implement Durable Functions manually if I want to gain the suggested benefits of that approach :(
+1, indeed.
+1, of course
+1, This would be very useful for long running ML Workloads
Absolutely +1.
+1 agreed. We could use this now.
+1 - Is there a timeline for when this will be available?
+1 - Definitely needed
+1 - Absolutely needed!
+1 - this is a MUST HAVE!
In the meantime - @asavaritayal, is it possible to use JavaScript durable functions for a Python function app?
This could pretty much replace Spark for me, and make the dev cycle much faster, because I personally find Spark much harder to write and debug than normal Python (Java stack traces inside Python ones).
Also good for the kind of stuff I would do in PowerShell, but I much prefer Python as a scripting language.
+1
+1 - Absolutely needed!
Happy to contribute, and looking for an update on this. Not sure whether this repo https://github.com/kashimiz/azure-functions-durable-python is official for Durable Functions.
Oh please, please, PLEASE. I've been trying to do some things in Node and there are no libraries to handle what I need.
This is truly needed for production deployment - if only to catch container errors. Equivalent functionality is available as step functions elsewhere. If there is an alternative way to catch container errors (memory limit, timeout) that would solve the basic problem.
+1 100% Absolutely necessary! Especially for long running ETL and data processing.
Looking forward to contributing to the functionality whenever possible
I guess this is not going anywhere even after a year! :(
Like @Sarah-Aly, I have a use case where pandas is used to manipulate multiple DataFrames and where the workflow could be simplified with Durable Functions.
+3 (voting on behalf of my lazy colleagues)
We use Python Functions for small ETL processes with pandas. Durable functions would help to very elegantly structure the individual steps.
I wrote a small Python package that handles chaining for me... It transparently sends messages to queues, which preceding jobs listen to, and monitors the execution status in a storage-account table.
All in all it works, but it feels real dirty :-(
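The idea described there can be sketched with nothing but the standard library - a toy version where an in-memory `queue.Queue` stands in for a storage queue and a dict stands in for the storage-account table (all names here are made up for illustration, not from the actual package):

```python
import queue

status_table = {}          # stand-in for the Azure Table storage status table
q = queue.Queue()          # stand-in for a storage queue

def run_job(name, payload, next_job=None):
    """Run one step, record its status, and hand off to the next step."""
    status_table[name] = "running"
    result = payload + 1               # pretend work
    status_table[name] = "done"
    if next_job:
        q.put((next_job, result))      # message the next job's queue
    return result

# Kick off the chain, then drain the queue the way a listening worker would.
run_job("step1", 0, next_job="step2")
while not q.empty():
    name, payload = q.get()
    run_job(name, payload)
```

The real package would presumably replace the dict and `Queue` with the Azure Storage SDK's table and queue clients, which is where most of the "dirty" plumbing lives.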
Can you share it? I'm really interested in how this might be achieved, and quick-and-dirty code is nothing to be ashamed of =)
+1
++ Yes, please!
See gist. Source for jobmanager + short readme + usage examples.
https://gist.github.com/KonoMaxi/b63f184bad7ffccbdcc4d818da7b6ee9
+1
Thank you very much for your interest. I can confirm that we have started working on this, and based on current estimates it looks like we should be able to get a beta out sometime early next calendar year. Please stay tuned for more updates - we would love for people on this thread to be able to give us feedback.
Awesome! Thanks a lot! (Also - your code is not bad or dirty at all!!!)
+1! Happy to contribute if needed. Actually, I was thinking of using JS for Durable Functions and triggering my Python non-durable functions from it with HTTP requests - has anyone used this, or is there a better approach? Thanks in advance!
As @anirudhgarg mentioned, we've started work on supporting Durable Functions. We would love to hear more details about what patterns you plan on using and what you'll be doing with it.
Hi Anthony, for my use case I am calling an Azure Function from ADF for a data processing task. So the two patterns I plan on using are Async API and function chaining - mainly to avoid the 2.5-minute API timeout when the file is too big or the job takes too long to finish.
I'd be looking at the usual patterns that I'd do with Durable Functions in C#/JS but being able to take advantage of Python's much larger ecosystem of ML/AI libraries - e.g. loading documents and then running NLP models over it to extract information before sending the results on to another service/human. To me it will also move a few workloads from DataBricks PySpark - generally filling the gap where the demands are beyond an individual function but really don't need the firepower & faff of a full Spark Cluster - bringing improved reliability, more straightforward & dependable programming and serverless operation.
It'd be interesting to see how parallel training of an ML model could be achieved using something like durable entities combined with overriding the threading behaviour and instead spinning up activity functions. No idea if that would work or if the latency cost would be too high but would be cool to investigate.
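As a rough stand-in for that fan-out/fan-in idea, one could simulate spinning up parallel "activity" tasks with asyncio - here each task pretends to train on one partition of the data, and the orchestrator aggregates the results (`train_partition` is a hypothetical placeholder, not a real Durable Functions call):

```python
import asyncio

async def train_partition(partition):
    await asyncio.sleep(0)          # placeholder for real training work
    return len(partition)           # pretend per-partition "score"

async def orchestrator(data, n_workers=3):
    # Fan out: one activity task per data chunk.
    chunks = [data[i::n_workers] for i in range(n_workers)]
    results = await asyncio.gather(*(train_partition(c) for c in chunks))
    # Fan in: aggregate the partial results.
    return sum(results)

total = asyncio.run(orchestrator(list(range(10))))
```

Whether real durable activity functions would have acceptable latency for this, as noted above, is an open question.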
We could also use it for the same patterns as Daniel mentioned. There’s a ML pattern called the Rendezvous pattern that this would be helpful for.
Chaining, Async API and Long-running functions are the most needed in our project, Thanks!
P.S Is there a way to be involved in the project early-on? Would really like to participate and contribute to it
In my project, we would use the fan-out/fan-in pattern to orchestrate the execution of different Functions which extract data from different sources; the data is then evaluated to check security settings and finally loaded into a database. Currently we do this orchestration with Logic Apps, but we would prefer to move to ADF to take advantage of it being stateful and easier to maintain than Logic Apps.
I would love to do a similar pattern with Python on Linux Functions and add some statistical processing with scipy. Eventually, even evaluating ensembles of Tensorflow models...
Potential workaround: create a Python function app with a custom Linux image, or host it on a Premium plan (I think this switches billing to cost per second vs. cost per execution). This should remove the timeout, right?
Then within that function app, use async requests and manually call the functions you need for your distributed computing. The custom image on Premium will run as long as needed and will be able to orchestrate the calls.
Would this not work?
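A hedged sketch of that workaround - a long-running host process fanning out HTTP calls to ordinary function endpoints. The URLs are hypothetical, and `call_function` is a stub standing in for a real HTTP POST (which would use `requests` or `aiohttp` against your function app):

```python
from concurrent.futures import ThreadPoolExecutor

def call_function(url, payload):
    # Stub for an HTTP POST to a non-durable function endpoint.
    return {"url": url, "result": payload * 2}

# Hypothetical endpoints for the individual work steps.
urls = [f"https://example-app.azurewebsites.net/api/step{i}" for i in range(3)]

# Fan the calls out across worker threads and collect the responses.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(call_function, urls, range(3)))
```

The catch is that this host process has to track state and retries itself - exactly the bookkeeping Durable Functions would otherwise handle.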
@tjhgit
I am experimenting with triggering an Azure Batch job from an Azure Function. So this is completely asynchronous, and you can also scale up the compute resources easily in Azure Batch.
@priyaananthasankar
+1 Absolutely needed. Working on a small sample with asyncio to simulate something similar.
Any luck in these adventures?
@mattc-eostar not really. It depends on how you trigger your function. If you use an HTTP trigger, you are limited by a 4-minute timeout - the function must execute and return output within this limit, otherwise it fails. And if you want to trigger your function from Data Factory (for example), you are pretty much limited to HTTP triggers.
New Feature - Looking for votes and/or user input to gauge traction.