dsyme closed 9 years ago
I added a CloudFile sample as well. @palladin and @eiriktsarpalis please take a look - I think some of the MBrace API is changing in this regard.
BTW please advise about how to teach people about partitioning cloud arrays.
The best way to read data from a cloud file is https://github.com/elastacloud/mbrace-on-brisk-starter/blob/master/src/Demos/4-parallel-web-download.fsx#L48
Is there any reason to use .NET Task rather than the mbrace Process? Process gives you all information around the process including running time etc. rather than pushing it into a Task where you lose a lot of the details.
Looks good. The brisk version seems to be lacking the client API for creating cloud refs without sending work to the cluster. This will be fixed as soon as we push the MBrace.Azure package.
Another example I can think of is one using ICloudDisposable. For instance,
cloud {
    use! data = CloudRef.New [| 1 .. 10000000 |]
    let! results = Array.init 10 (fun i -> doWork i data) |> Cloud.Parallel
    return (data, results) // return the data cloud ref to verify it has been disposed from store
}
I'm just using Task because that's what I've got used to. I'll try Process and then adjust systematically
@isaacabraham @dsyme The latest release of MBrace.Core comes with a cloud task abstraction. It's essentially like the process type but can be created and consumed by cloud workflows. It should replace any occurrence of System.Threading.Task in the client API.
@eiriktsarpalis: When would you use CloudTask over Cloud? @dsyme: cool - I think moving to Process will be better insofar as it provides a simpler experience for reasoning about what's happening to a job that is submitted.
@eiriktsarpalis - OK, great. I'll switch to CreateProcess for now.
@isaacabraham - any ETA when we can upgrade the Brisk cluster creation to use a newer MBrace.Core? It looks like there are lots of good improvements to the programming model in the works.
@eiriktsarpalis - what is the "Run" method? cluster.RunAsTask returning an ICloudTask?
@isaacabraham - actually I'll leave it as RunAsTask for now, until @eiriktsarpalis advises which is the most stable, preferred option in the light of the upcoming API improvements.
sounds like a plan.
@eiriktsarpalis: is there anywhere (aside from your summary on the google group) where there's a slightly lower-level view of upcoming features / changes?
@eiriktsarpalis: we can do it whenever we want. I'm going to look to expand the roles offered anyway to more than just medium workers so we can do it at the same time. If you can label the appropriate repo (or let me know which nuget packages to get) then I can build it no prob. Also - are we working off the lib folder approach or moving to nuget packages?
@isaacabraham A CloudTask<_> denotes an executing computation in the cluster. Cloud<_> is just a deferred workflow that can be executed an arbitrary number of times. This is pretty similar to the difference between Task<_> and Async<_>. Unfortunately there is currently no outline of the programming model features, implemented or planned. This must be addressed soon.
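The Task<_> versus Async<_> analogy can be seen in plain F#. The following is only a minimal sketch using the standard .NET and FSharp.Core libraries, not the MBrace API; the names `runs`, `deferred` and `started` are made up for illustration. It assumes nothing about how CloudTask<_> or Cloud<_> are implemented, and only shows the hot-versus-deferred distinction the comparison relies on.

```fsharp
open System.Threading.Tasks

// Counter used to observe how many times the workflow body actually executes.
let runs = ref 0

// Async<int> is a deferred description of work: defining it executes nothing.
let deferred : Async<int> =
    async {
        runs.Value <- runs.Value + 1
        return 42
    }

// Running the same Async value twice executes the body twice.
let a1 = Async.RunSynchronously deferred
let a2 = Async.RunSynchronously deferred
let runsAfterAsync = runs.Value

// Task<int> is a handle to one execution that has already been started.
let started : Task<int> = Async.StartAsTask deferred

// Reading the result twice observes the same execution; the body is not re-run.
let t1 = started.Result
let t2 = started.Result
let runsAfterTask = runs.Value

printfn "body executions after two Async runs: %d" runsAfterAsync
printfn "body executions after starting one Task and reading it twice: %d" runsAfterTask
```

By analogy, a Cloud<_> value would play the role of `deferred` here and a CloudTask<_> the role of `started`.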
@dsyme We're still working on incorporating the latest core version with MBrace.Azure. Most of the work is done, we're a few failing unit tests away from completion. A CloudTask can be started either from the cluster client or using the Cloud.StartAsCloudTask : Cloud<'T> -> Cloud<ICloudTask<'T>> primitive within a cloud workflow. In the brisk bits, this can be somewhat achieved using the Cloud.StartChild : Cloud<'T> -> Cloud<Cloud<'T>> primitive, but this is not quite as flexible.
@eiriktsarpalis OK, cool, thanks for the update, that sounds great (I do wonder if ICloudTask should be CloudTask since it's strange to have the "I" prefix appear in the programming model like that for the first time).
In this context we were asking about the best "Run" method to standardize on when scripting on the local machine - CreateProcess, RunAsTask, RunAsync. It looks like we're using CreateProcess for now :)
@dsyme unless you just want to get the result back immediately - then just call Run :-)
@dsyme I would expect that Process<_> is going to implement the ICloudTask<_> interface, so CreateProcess and RunAsTask should merge. Run is probably the best choice for a quick hello world-like demo, but in practice CreateProcess is the best choice when expecting to do some sort of debugging.
That's how I've been using Run() - for demos in .fsx files. For a production system you would undoubtedly want to have some job monitoring involved.
Added a sample to upload data as blobs using CloudRef and CloudArray