smi2 / phpClickHouse

php ClickHouse wrapper
MIT License

StreamRead uses a lot of memory #171

Closed Miriameri closed 2 years ago

Miriameri commented 2 years ago

When I try to read data from a big table, I run out of memory. I am using StreamRead.

$streamRead = new StreamRead($stream);
$this->clickHouseManager->streamRead(
    $streamRead,
    'SELECT {columns} FROM {table_name} FORMAT CSVWithNames',
    [
        'table_name' => $tableName,
        'columns' => $columns,
    ]
);

In public function execOne(CurlerRequest $request, $auto_close = false), all of the data is buffered after curl_exec($h)

isublimity commented 2 years ago

My example does not use much memory; the stream is a file. Does that work for you?


// Set memory limit
ini_set('memory_limit', '2M');
echo 'Limit:'.ini_get("memory_limit")."\n";
echo 'Memory usage 0: ' . round(memory_get_usage() / 1024).' : '.round(memory_get_peak_usage() / 1024) . "kb \n";

// Connect 
$client = new ClickHouseDB\Client($config);

// Drop file if exists
@unlink('/tmp/ch_test.memory.limit');

// Create new file 
$stream = fopen('/tmp/ch_test.memory.limit','w+');

// Stream query results into the file
$streamRead = new ClickHouseDB\Transport\StreamRead($stream);

echo 'Memory usage 1: ' . round(memory_get_usage() / 1024).' : '.round(memory_get_peak_usage() / 1024) . "kb \n";

$r = $client->streamRead($streamRead, 'SELECT sin(number) as sin, cos(number) as cos, cos(number) as cos1, cos(number) as cos2, cos(number) as cos4 FROM {table_name} LIMIT 4123111 FORMAT JSONEachRow', ['table_name' => 'system.numbers']);

echo 'Memory usage 2: ' . round(memory_get_usage() / 1024).' : '.round(memory_get_peak_usage() / 1024) . "kb \n";

// Echo 
echo "size_download:".($r->info()['size_download'])."\n";

fclose($stream);

echo 'Memory usage 3: ' . round(memory_get_usage() / 1024).' : '.round(memory_get_peak_usage() / 1024) . "kb \n";
Miriameri commented 2 years ago

isublimity Hi, I am using php://memory. It is important for me that the data stays in RAM, but I get the error Error: Allowed memory size of XXX bytes exhausted, because curl_exec($h) takes up all the memory

isublimity commented 2 years ago

curl_exec -> writes to the stream in php://memory => out of memory.

If memory_limit = 2M and the response is a 4M stream, you get out of memory inside curl_exec. Try using a file, or batch reads -)
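If keeping data in RAM is preferred but its size is uncertain, php://temp (unlike php://memory) transparently spills to a temporary file once it grows past a threshold, so the response never exhausts memory_limit. A minimal sketch, independent of phpClickHouse:

```php
<?php
// php://temp behaves like php://memory up to maxmemory bytes
// (default 2 MB), then transparently spills to a temporary file,
// so a large response never exhausts the PHP memory_limit.
$threshold = 2 * 1024 * 1024; // spill to disk beyond 2 MB
$stream = fopen('php://temp/maxmemory:' . $threshold, 'w+');

// Pass $stream to StreamRead exactly as with php://memory:
// $streamRead = new ClickHouseDB\Transport\StreamRead($stream);

fwrite($stream, str_repeat('x', 3 * 1024 * 1024)); // exceeds threshold
rewind($stream);
echo strlen(stream_get_contents($stream)) . "\n"; // 3145728
fclose($stream);
```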

luckyraul commented 2 years ago

What if memory and disk size are negligible compared to the ClickHouse SELECT result?

isublimity commented 2 years ago

You must change your data strategy. If the data doesn't fit in memory, use a different approach.

Like this:

  1. Read in batches - LIMIT OFFSET
  2. Prepare the data in CH -> GROUP BY, HAVING, ...
  3. Use a temporary table on the ClickHouse server: INSERT INTO temp_table SELECT ... FROM big_table GROUP BY ..., then read temp_table from php.
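The LIMIT/OFFSET batching from point 1 can be sketched as a generator. Here $fetchPage abstracts the actual query so the loop can be exercised without a live server; with phpClickHouse it would wrap a call like $client->select('... LIMIT {limit} OFFSET {offset}')->rows():

```php
<?php
// Yield rows page by page so only one batch is held in PHP memory.
// $fetchPage(int $limit, int $offset): array - returns one page of rows.
function readInBatches(callable $fetchPage, int $batchSize): \Generator
{
    $offset = 0;
    do {
        $rows = $fetchPage($batchSize, $offset);
        foreach ($rows as $row) {
            yield $row;
        }
        $offset += $batchSize;
    } while (count($rows) === $batchSize); // a short page means we are done
}

// Exercise the loop with an in-memory "table" of 25 rows:
$table = range(1, 25);
$fetchPage = fn(int $limit, int $offset): array => array_slice($table, $offset, $limit);
$count = 0;
foreach (readInBatches($fetchPage, 10) as $row) {
    $count++;
}
echo $count . "\n"; // 25
```

Note that in ClickHouse large OFFSETs get progressively slower, so paginating on a sorted key (WHERE id > $lastId ORDER BY id LIMIT n) usually scales better than a raw OFFSET.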
Miriameri commented 2 years ago

Found a solution:

$streamRead->closure(function ($curl, $data) {
    // Handle each chunk as it arrives instead of buffering the response
    echo $data;

    return \strlen($data);
});
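For context, closure() installs the chunk handler cURL invokes as data arrives (a CURLOPT_WRITEFUNCTION-style callback), and returning the chunk's length tells cURL it was consumed. Chunks can end mid-row, so a small line buffer helps if rows must be processed one at a time. A sketch, where processRow is a hypothetical per-row handler:

```php
<?php
// Split an incoming chunk into complete newline-terminated rows,
// carrying the incomplete tail over to the next chunk.
function splitChunk(string $buffer, string $chunk): array
{
    $buffer .= $chunk;
    $rows = explode("\n", $buffer);
    // The last element is '' (chunk ended on "\n") or a partial row.
    $tail = array_pop($rows);
    return [$rows, $tail];
}

// Usage inside the closure (processRow is hypothetical):
// $buffer = '';
// $streamRead->closure(function ($curl, $data) use (&$buffer) {
//     [$rows, $buffer] = splitChunk($buffer, $data);
//     foreach ($rows as $row) {
//         processRow($row);
//     }
//     return \strlen($data); // consumed the whole chunk
// });

[$rows, $tail] = splitChunk('', "a,1\nb,2\npartial");
echo implode('|', $rows) . ' / ' . $tail . "\n"; // a,1|b,2 / partial
```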