DarkWanderer / ClickHouse.Client

.NET client for ClickHouse
MIT License
315 stars 62 forks source link

Error when executing a bulk insert - Value was either too large or too small for a UInt16. #455

Closed Bortolazzi closed 6 months ago

Bortolazzi commented 6 months ago

I am performing an insertion test to check the feasibility of migrating the analytical database.

I created a test table to create a data load. I tested with 1,000 records - ok I tested with 10,000 records - ok

When I went to the 100,000 records test I received the following error:

Exception Message: Error when serializing data

Exception Source: ClickHouse.Client

Exception TargetSite: System.IO.Stream SerializeBatch(Batch)

Inner Exception:
Exception Message: Value was either too large or too small for a UInt16.

Exception Source: System.Private.CoreLib

Exception TargetSite: Void ThrowUInt16OverflowException()

Exception StackTrace:    at System.Convert.ThrowUInt16OverflowException()
   at ClickHouse.Client.Types.DateType.Write(ExtendedBinaryWriter writer, Object value)
   at ClickHouse.Client.Copy.ClickHouseBulkCopy.SerializeBatch(Batch batch)

This is how my table was created:

CREATE TABLE wan_test
(
    `id` UUID,
    `name` String NOT NULL,
    `birth_date` Date NULL, 
    `is_active` Boolean NOT NULL, 
    `wage` Decimal(18,2) NOT NULL,
    `age` INT NOT NULL, 
    `created_at` DateTime DEFAULT now()
)
ENGINE = MergeTree
PRIMARY KEY id
ORDER BY id
SETTINGS index_granularity = 8192;

This is how my class is in c#:

[Table("wan_test")]
public class TableTestEntity 
{
    public TableTestEntity() { }

    [Column("id")]
    public Guid Id { get; set; }

    [Column("name")]
    public string Name { get; set; }

    [Column("birth_date")]
    public DateTime? BirthDate { get; set; }

    [Column("is_active")]
    public bool IsActive { get; set; }

    [Column("wage")]
    public decimal Wage { get; set; }

    [Column("age")]
    public int Age { get; set; }

    [Column("created_at"), DatabaseGenerated(DatabaseGeneratedOption.Computed)]
    public DateTime CreatedAt { get; set; }
}

I created a function to create and populate records dynamically, based on a for loop:

List<TableTestEntity> GenerateDataToInsert(int quantity)
{
    var response = new List<TableTestEntity>();

    var birthDate = new DateTime(1992, 10, 28);

    for (int i = 0; i < quantity; i++)
    {
        response.Add(new TableTestEntity()
        {
            Id = Guid.NewGuid(),
            Name = $"test {i}",
            BirthDate = (i % 2) == 0 ? birthDate.AddDays(i) : null,
            Age = i,
            IsActive = true,
            Wage = decimal.Round(1.12m * i, 2, MidpointRounding.AwayFromZero),
            CreatedAt = DateTime.MinValue
        });
    }

    return response;
}

I carried out a test inserting the records in batches and it worked, however, when I use bulk insert I receive the error reported above.

Can you tell me if it's a bug or if somewhere in the code I did something stupid?

Thank you very much in advance for your attention.

DarkWanderer commented 6 months ago

ClickHouse Date value has range of [1970-01-01, 2149-06-06]. 100000 days is 274 years, so you end up with BirthDate of year 2256 - which is outside the range

Bortolazzi commented 6 months ago

Wow, thank you very much, I hadn't noticed that.