tomasfabian / ksqlDB.RestApi.Client-DotNet

ksqlDb.RestApi.Client is a C# LINQ-enabled client API for issuing and consuming ksqlDB push and pull queries and executing statements.
MIT License
97 stars 26 forks source link

Specify IgnoreByInserts using the model builder #90

Closed tomasfabian closed 1 month ago

tomasfabian commented 1 month ago

To exclude fields not supported for INSERT statements such as HEADERS using the model builder, a new function IgnoreByInserts should be introduced as shown below.

private record KeyValuePair
{
  public string Key { get; set; } = null!;
  public byte[] Value { get; set; } = null!;
}

private record Record
{
  public string Message { get; set; }
  public KeyValuePair[] Headers { get; init; } = null!;
}
using ksqlDb.RestApi.Client.FluentAPI.Builders;
using ksqlDB.RestApi.Client.KSql.Query;
using ksqlDB.RestApi.Client.KSql.RestApi.Enums;
using ksqlDB.RestApi.Client.KSql.RestApi.Statements.Properties;

ModelBuilder modelBuilder = new();

modelBuilder.Entity<Record>()
        .Property(b => b.Headers)
        .AsStruct()
        .WithHeaders()
        .IgnoreByInserts();

var record = new Record { Message = "Hello" };

// CreateInsert is internally used by KSqlDbRestApiClient.ToInsertStatement
var statement = new CreateInsert(modelBuilder).Generate(record, new InsertProperties { IdentifierEscaping = escaping });
Console.WriteLine(statement);

The above-generated statement should result in the following output:

INSERT INTO Records (Message) VALUES ('Hello');

This proposal is based on insights from @mrt181 in issue #89.

mrt181 commented 1 month ago

I think I would set IgnoreByInserts automatically when using WithHeaders because this must be set together to work anyway.

This immediately prevents a malformed insert statement in case someone forgets to add IgnoreByInserts.

That error would be only visible when a test checks inserts or during runtime when an app tries to do that, which could be late in an apps lifetime.

tomasfabian commented 1 month ago

I'm considering introducing a PseudoColumn() function instead of IgnoreByInserts() (or perhaps both options could coexist, as outlined below), which would exclude fields/properties with this configuration from INSERT statements. The PseudoColumn attribute is already used by the IdentifierUtil, while the IgnoreByInserts attribute is necessary for ignoring fields during INSERT INTO statement generation.

The fluent API already contains an Ignore() function that excludes a field from all DDL or KSQL statements. I need to double-check this, but EntityInfo shouldn't contain this line, since it's also a base class for CreateEntity. Instead, this line should be added in an overridden method within CreateInsert.

Using WithHeaders would automatically mark the column as a pseudo column, as you suggested above. These pseudo columns should be automatically excluded from INSERT statements, right? Do you still see any use cases for the IgnoreByInserts attribute, or should we consider merging it with the PseudoColumn attribute?

So the question is: do we need to separately mark fields to be ignored during DML statement generation, while still allowing them to be used in KSQL or DDL statements? If this applies only to pseudo columns, the IgnoreByInserts attribute could be omitted. However, if we retain the IgnoreByInserts attribute, should the Ignore() function in the fluent API also exclude fields from DMLs?

Sorry if this sounds too confusing. I would like to consolidate these options in the most optimal way.

Thank you @mrt181!

mrt181 commented 1 month ago

PseudoColumn marks ksqlDB provided columns. https://docs.ksqldb.io/en/latest/reference/sql/data-definition/#pseudocolumns

But I find the docs misleading, because you have to define a column that is marked as HEADERS or HEADER('key').

The docs also state:

You can't create additional pseudocolumns beyond these.

The IdentifierUtil checks for the PseudoColumn attribute to determine if to escape the column name from a query, which it shouldn't do for those columns.

You already provide a Record type that contains the PseudoColumns.

I might have added a HEADERS property there 😖, following the docs but I now understand that those are not correct. It should be removed.

Maybe provide a baked in HEADERS type that contains the right property definitions for key and value and WithHeaders just has to check that it's that type. Maybe source generators can be used on such a type?

Than add a WithHeader(string key) extension for the HEADER('key') case.

Ignore in DDL removes it from the schema, as a consequence of that it can't be used in DML because the column doesn't exist, right?

IgnoreByInsert is only concerned with DML.

I think they should both exist, the first one should apply to DDL and DML.

The second just to DML.

Maybe they should be renamed?

Are good candidates imo.

tomasfabian commented 1 month ago

You are right. I marked the Headers property as obsolete in the Record type.

In order to avoid a breaking change, the Ignore function could remain as it is, with the behavior exactly as described above by you:

in DDL removes it from the schema, as a consequence of that it can't be used in DML because the column doesn't exist

I also introduced a new IgnoreAttribute.

Thank you very much for your insights so far! Could you also review this PR #92 when you have some time?

Omission from DML can be configured using the IgnoreInDML function. Of course, let me know if the name isn't suitable.

If the need arises to include the configuration of pseudo columns in the fluent API, feel free to open an issue.