cetz-package / cetz

CeTZ: ein Typst Zeichenpaket - A library for drawing stuff with Typst.
https://cetz-package.github.io
GNU Lesser General Public License v3.0

High memory usage when plotting #614

Closed (miasb closed this issue 3 weeks ago)

miasb commented 1 month ago

Hi,

I'm trying to plot some CSV files of about 60 KiB each using the cetz.plot utility. Plotting those ~2300 x-y tuples uses ~7 GiB of RAM.

To read the files I use the built-in csv function and convert all string tuples to floats.

#let csv_to_data(path: none, scaling: 1.0) = {
  let arr = csv(path, delimiter: " ")

  let vals = ()
  for i in arr {
    vals.push( ( float(i.at(0)), float(i.at(1))*scaling ) ) 
  }

  return vals
}

This part works well and takes a few milliseconds. When I try to plot the resulting array, it takes about 7 GiB of RAM and over a minute of processing time on my laptop.

This is my plotting code:

    plot.plot(
      size: (2.8, 2.8), x-min: -13, x-max: -6,
      {
        plot.add(
          csv_to_data(path: "Data/G0W0_PBE/CP2K/Only_HOMO/0_O3_PBE/self_energy_re.dat"),
          style: (stroke: 2pt + blue),
        )
      },
    )

Is this to be expected?

Thank you in advance, Mia

johannes-wolf commented 1 month ago

Hi, I am able to reproduce extreme RAM consumption by cetz.plot with a 250000 row CSV, but not with only 2500 rows.

I will look into this, but it is not that easy to find out where these huge allocations happen in Typst code.

Plotting thousands of tuples into a PDF might not be a good idea, as it will also slow down the renderer of the PDF reader. Can you try setting line: "raw" in the plot.add call, i.e. plot.add(.., line: "raw")? Can you also share the CSV file?

miasb commented 1 month ago

Hi,

Thanks for your answer. I have attached my CSV file. Unfortunately, adding the line: "raw" option didn't help in my case.

self_energy_re.csv

johannes-wolf commented 1 month ago

Thank you. It seems to be a bug/inefficiency in my naive clipping code. Without the axis limits (x-min/x-max) it works fine. A workaround would be calling .filter(((x,y)) => x >= -13 and x <= -6) on the data before passing it to plot.add.
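
The workaround could look roughly like the sketch below. It reuses the csv_to_data helper and the plot call from the first comment and only adds the filter step. The import line, the pinned cetz version (0.2.2) and the names data and visible are assumptions made for illustration; they do not appear in this thread.

```typst
// Assumption: cetz 0.2.2, a version in which the plot module ships with cetz.
#import "@preview/cetz:0.2.2": canvas, plot

// Helper from the first comment: read a space-delimited file into (x, y) floats.
#let csv_to_data(path: none, scaling: 1.0) = {
  let arr = csv(path, delimiter: " ")
  let vals = ()
  for i in arr {
    vals.push((float(i.at(0)), float(i.at(1)) * scaling))
  }
  return vals
}

#let data = csv_to_data(
  path: "Data/G0W0_PBE/CP2K/Only_HOMO/0_O3_PBE/self_energy_re.dat",
)

// Keep only the points inside the visible x-range, so that the plot
// has no out-of-range points left to clip along x.
#let visible = data.filter(((x, y)) => x >= -13 and x <= -6)

#canvas({
  plot.plot(
    size: (2.8, 2.8), x-min: -13, x-max: -6,
    {
      plot.add(visible, style: (stroke: 2pt + blue))
    },
  )
})
```

The filter bounds have to match x-min and x-max by hand here; if the axis limits change, the filter needs the same change.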

miasb commented 1 month ago

Thank you for your answer. I added filtering to my csv_to_data function and it works flawlessly. The initial compilation of my six-plot figure takes ~2 s, but incremental compilation using typst watch goes down to ~20 ms after that. Do you plan to implement some kind of CSV -> data function?

johannes-wolf commented 1 month ago

What do you mean by CSV -> data? What would you expect from such a function? You can map your rows to points quickly via .map(row => (0, 1).map(col => float(row.at(col)))). This compiles faster than the push version.
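
As a minimal sketch, the csv_to_data helper from the first comment could be rewritten around map like this. It keeps the original path and scaling parameters and assumes, as in the original, that column 0 holds x and column 1 holds y. The local name rows is illustrative.

```typst
// Sketch: map-based variant of csv_to_data, replacing the push loop.
#let csv_to_data(path: none, scaling: 1.0) = {
  // csv returns an array of rows, each row an array of strings.
  let rows = csv(path, delimiter: " ")
  // Convert each row to an (x, y) pair of floats, scaling only y.
  // The last expression of the block is the function's return value.
  rows.map(row => (float(row.at(0)), float(row.at(1)) * scaling))
}
```

Unlike the one-liner above, this version spells out both columns so that scaling applies only to y.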

johannes-wolf commented 1 month ago

I've pushed a fix :).