RelationalAI / rai-sdk-csharp

The RelationalAI Software Development Kit (SDK) for C#
Apache License 2.0
0 stars 1 forks source link

ListEngines example is throwing exception #26

Closed miazamrai closed 2 years ago

miazamrai commented 2 years ago

ListEngines example is throwing the following exception

RelationalAI.Errors.InvalidResponseException: Failed to deserialize response into type ListEnginesResponse
   at RelationalAI.Utils.Json`1.Deserialize(String data, String key) in /Users/iazam/dev/rai-sdk-csharp/RelationalAI/Utils/Json.cs:line 38
   at RelationalAI.Services.Client.ListEnginesAsync(Nullable`1 state) in /Users/iazam/dev/rai-sdk-csharp/RelationalAI/Services/Client.cs:line 181
   at RelationalAI.Examples.ListEngines.Run(Nullable`1 state, String profile) in /Users/iazam/dev/rai-sdk-csharp/RelationalAI.Examples/ListEngines.cs:line 36
   at System.CommandLine.NamingConventionBinder.CommandHandler.GetExitCodeAsync(Object returnValue, InvocationContext context)
   at System.CommandLine.NamingConventionBinder.ModelBindingCommandHandler.InvokeAsync(InvocationContext context)
   at System.CommandLine.Invocation.InvocationPipeline.<>c__DisplayClass4_0.<<BuildInvocationChain>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass18_0.<<UseParseErrorReporting>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass13_0.<<UseHelp>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass22_0.<<UseVersionOption>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass20_0.<<UseTypoCorrections>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseSuggestDirective>b__19_0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass17_0.<<UseParseDirective>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<RegisterWithDotnetSuggest>b__6_0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass9_0.<<UseExceptionHandler>b__0>d.MoveNext()

Environment to repro: Latest

miazamrai commented 2 years ago

Root cause: The list engines response is returning an engine with size XS/2 which is not an enum value in the EngineSize enumeration.

larf311 commented 2 years ago

We shouldn't need to update the SDKs just b/c there is a new engine size and we certainly shouldn't break existing SDKs.

Segflow commented 2 years ago

Why should the sizes be an enum? We moving to distributed engines where what we used to call Engine will become a cluster of engines.

Think of XS/2 being engine cluster with 2 node each being XS. There will be too many variant, XL/4, S/3 etc..

Creating an enum for all this is not scalable.

stanislav-bedunkevich commented 2 years ago

@Segflow would it be more natural to do this instead?

{
    "type": "Cluster",
    "engineSize": "XL",
    "clusterSize": 5
}
and
{
    "type": "Engine",
    "engineSize": "XL"
}

XL/4 feels confusing to denote "4 engines in a cluster"

stanislav-bedunkevich commented 2 years ago

Moreover, is this cluster thing user-facing? I.e. do users need to do anything at all working with it vs a single engine?

If so, it's a breaking change requiring a client update, so it's certainly not a problem to update the SDK?

Segflow commented 2 years ago

Changing the payload would break the api, something we cannot do right now. We understand that xs/4 is not ideal, but it allpw us to experiment with the feature without breaking other people workflow.

Reusing the same field to encode 2 different semantics allows us to roll out the feature faster.

I heard that in the future this will change and user will just say how many CPU they need.

NHDaly commented 2 years ago

Changing the payload would break the api

Is that actually true? Isn't it a non-breaking change to add a field? (And a breaking change to change the meaning of a field? -- Though in this case i still think the change made was pragmatic and reasonable 👍).

I would think that we should be able to add a field to the get_engine() response without worrying about breaking changes. I think that @whatsthecraic would also like us to report the number of workers anyway, since it would be useful for the benchmarking framework to know how many workers an engine had.

billscheidel-rai commented 1 year ago

Note: This issue has been migrated to https://relationalai.atlassian.net/browse/RAI-6831.

This link is only accessible to employees of RelationalAI.