matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

[Bug]: mp3 file size 7.2M insert into blob failed #6302

Open heni02 opened 1 year ago

heni02 commented 1 year ago

Is there an existing issue for the same bug?

Environment

- Version or commit-id (e.g. v0.6.0 or 8b23a93): e983be5ad02b14da54754be5e64410446819f421
- Hardware parameters:
- OS type:
- Others:

Actual Behavior

[Two screenshots; images not preserved]

Expected Behavior

No response

Steps to Reproduce

create table blob_02(a int,b blob);
insert into blob_02 values(1,load_file('/Users/heni/Downloads/blobtest01.mp3'));
insert into blob_02 select 2,load_file('/Users/heni/Downloads/blobtest01.mp3');

Additional information

No response

JackTan25 commented 1 year ago

Working on it.

jensenojs commented 1 year ago

According to @domingozhang, the size limit of our blob is 1 GB, so I changed blobsize in load_file.go and loaded a 7.75 MB mp3; it works fine.
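
For readers following along, here is a minimal sketch of the kind of size check being discussed. It is not the actual code in load_file.go: the names maxBlobSize, checkBlobSize and loadFile are hypothetical, and the 1 GB value is simply the figure quoted above.

```go
package main

import (
	"fmt"
	"os"
)

// maxBlobSize is a hypothetical cap used for illustration only;
// 1 GB is the figure mentioned in the thread.
const maxBlobSize int64 = 1 << 30

// checkBlobSize returns an error if size exceeds limit.
func checkBlobSize(size, limit int64) error {
	if size > limit {
		return fmt.Errorf("file size %d bytes exceeds blob limit %d bytes", size, limit)
	}
	return nil
}

// loadFile reads path only after confirming the file fits under limit.
func loadFile(path string, limit int64) ([]byte, error) {
	info, err := os.Stat(path)
	if err != nil {
		return nil, err
	}
	if err := checkBlobSize(info.Size(), limit); err != nil {
		return nil, err
	}
	return os.ReadFile(path)
}

func main() {
	// A 7.75 MB file, like the mp3 mentioned above, against the 1 GB cap.
	size := int64(7.75 * 1024 * 1024)
	fmt.Println("fits:", checkBlobSize(size, maxBlobSize) == nil)
}
```

Checking the size with os.Stat before reading avoids pulling an oversized file into memory just to reject it.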

jensenojs commented 1 year ago

will fix later

jensenojs commented 1 year ago

will fix later

jensenojs commented 1 year ago

https://github.com/matrixorigin/matrixone/issues/6824#issue-1458967168

This will probably have to wait until that issue is done.

jensenojs commented 1 year ago

will fix later

jensenojs commented 1 year ago

This depends on CN insert to S3; we are currently investigating the related bugs.

jensenojs commented 1 year ago

will fix later

JackTan25 commented 1 year ago

He is away for the Spring Festival holiday.

florashi181 commented 1 year ago

@heni02 Since the write-to-S3 work is finished, could you test again?

heni02 commented 1 year ago

@florashi181 @jensenojs The insert failed. Commit id: 756465a2a52adf7d0b5c67b06bc43d692d28bfb9 [Two screenshots; images not preserved]

jensenojs commented 1 year ago

The previous discussion is at this link, and some of its conclusions are outdated. I'll summarize the current state of the problem here.

  1. Directly adjusting blobsize is not feasible, because mpool does not allow a memory allocation larger than 1 GB.
  2. The previous discussion suggested that larger blobs might need to be written to S3, or that the mpool limit might need to be raised.
    • Simply raising the mpool upper limit does not actually solve the problem, because the size limit applies to the vector, not to a single element of the vector.
      • If a vector holds a hundred blob objects and each blob can be 1 GB, the vector can grow to more than a hundred GB. But the upper size limit for every vector is 1 GB.
    • If we want to write blobs to S3, the blob type needs to be refactored.
      • The original blob is not very different from a string (just with a larger upper limit), but now the value of a blob attribute may live either locally or on S3. A more detailed design document is needed to work through these cases.
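
The arithmetic behind the vector-level limit can be sketched as follows. This is an illustration only, not mpool or vector code from the repository: vectorCap, totalBytes and fitsInVector are hypothetical names, and the 1 GB cap is the figure quoted above.

```go
package main

import "fmt"

// vectorCap is the 1 GB per-vector limit described above
// (hypothetical constant, for illustration only).
const vectorCap int64 = 1 << 30

// totalBytes sums the sizes of all elements held in one vector.
func totalBytes(elemSizes []int64) int64 {
	var total int64
	for _, s := range elemSizes {
		total += s
	}
	return total
}

// fitsInVector reports whether the elements together stay within limit.
func fitsInVector(elemSizes []int64, limit int64) bool {
	return totalBytes(elemSizes) <= limit
}

func main() {
	// One hundred blobs of 1 GB each, as in the example above.
	sizes := make([]int64, 100)
	for i := range sizes {
		sizes[i] = 1 << 30
	}
	fmt.Println("total GB:", totalBytes(sizes)>>30)
	fmt.Println("fits in one vector:", fitsInVector(sizes, vectorCap))
}
```

Even if every individual blob respects a 1 GB element limit, the check fails as soon as their sum crosses the vector cap, which is why raising the per-element limit alone does not help.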

matrix-meow commented 5 months ago

Hello @heni02. The bug issue referenced in the BVT test code has not been removed, so this issue has been automatically reopened.