pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Getting `.columns` of LazyFrame created by `scan_parquet` is slow and uses a lot of memory #16022

Open CangyuanLi opened 2 weeks ago

CangyuanLi commented 2 weeks ago

Checks

Reproducible example

Unfortunately, the data cannot be shared. However, the file is fairly large: 3,604,869 rows and 3,508 columns, a mix of Int64 and String. When I try to view the columns of the LazyFrame like below

pl.scan_parquet(file).columns

I get:

CPU times: user 13.5 s, sys: 5.62 s, total: 19.1 s
Wall time: 19.2 s

Peak memory usage: ~14GB
Final memory usage: ~1GB
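For comparison, what I would expect .columns to cost is roughly a metadata-only read. A minimal sketch of that (assuming pl.read_parquet_schema is the appropriate call here; this is not part of the timing above):

```python
import polars as pl

# Sketch: read only the parquet schema (footer metadata) to get the
# column names, without going through a LazyFrame. `file` is the same
# parquet path as above.
schema = pl.read_parquet_schema(file)
columns = list(schema.keys())
```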

Log output

No response

Issue description

Curiously, on another similarly sized parquet file (3,759,545 rows and 3,002 columns), I get

CPU times: user 49.3 ms, sys: 28.8 ms, total: 78.1 ms
Wall time: 81.1 ms

and negligible (maybe 100-200 MB) memory usage, which is more in line with what I was expecting. As far as I know, the slow parquet file was written out using sink_parquet with default options, so I don't think it differs from the "normal" parquet file in compression algorithm or ratio.

I also compared fetching the first 500 rows against DuckDB, since I am not sure that getting the columns is a 1:1 comparison (I believe DuckDB reads only the parquet metadata; I am not entirely sure whether Polars does the same).

pl.scan_parquet(f).fetch(500)

yields

CPU times: user 32.9 s, sys: 13.8 s, total: 46.7 s
Wall time: 47.1 s

Peak memory usage: ~25GB
Final memory usage: ~1.5GB

while

duckdb.sql("SELECT * FROM read_parquet('file.parquet') LIMIT 500")

yields

CPU times: user 14.5 s, sys: 6.23 s, total: 20.7 s
Wall time: 20.6 s

Peak memory usage: ~4.5GB
Final memory usage: ~1.5GB

Expected behavior

I would expect running .columns on a LazyFrame to return almost instantly, since only the parquet schema/metadata should need to be read.

Installed versions

```
--------Version info---------
Polars:               0.20.23
Index type:           UInt32
Platform:             Linux-4.18.0-372.71.1.el8_6.x86_64-x86_64-with-glibc2.28
Python:               3.10.6 (main, Oct 24 2022, 16:07:47) [GCC 11.2.0]
----Optional dependencies----
adbc_driver_manager:
cloudpickle:          2.1.0
connectorx:
deltalake:
fastexcel:
fsspec:               2021.10.1
gevent:
hvplot:
matplotlib:           3.8.3
nest_asyncio:         1.5.5
numpy:                1.23.5
openpyxl:             3.0.10
pandas:               2.2.1
pyarrow:              15.0.0
pydantic:
pyiceberg:
pyxlsb:
sqlalchemy:           1.4.39
xlsx2csv:
xlsxwriter:
```
CangyuanLi commented 2 weeks ago

I think this is related to #13092, where sink_parquet creates a large number of small row groups. Looking at the good file, the metadata is

<pyarrow._parquet.FileMetaData object at 0x7f80fc3c1cb0>
  created_by: Polars
  num_columns: 3002
  num_rows: 3759545
  num_row_groups: 14
  format_version: 2.6
  serialized_size: 3470050
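(For reference, the metadata dump above can be produced with pyarrow along these lines; the path is a placeholder:)

```python
import pyarrow.parquet as pq

# Read only the parquet footer metadata; "good_file.parquet" stands in
# for the actual file path.
meta = pq.read_metadata("good_file.parquet")
print(meta)                 # FileMetaData summary like the one above
print(meta.num_row_groups)  # 14 for the "good" file
```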

After I run

pl.scan_parquet(f).sink_parquet("test.parquet")

the metadata becomes

<pyarrow._parquet.FileMetaData object at 0x7f02ec2f2e30>
  created_by: Polars
  num_columns: 3002
  num_rows: 3759545
  num_row_groups: 3752
  format_version: 2.6
  serialized_size: 1048446714

and I see the poor performance and memory usage pop up.
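A possible workaround I have not fully verified: collect the file eagerly and rewrite it with fewer, larger row groups, then re-check the metadata and the .columns timing. Roughly (paths and row_group_size are illustrative):

```python
import polars as pl

# Sketch of an unverified workaround: rewrite with larger row groups so
# the footer metadata stays small. Paths and row_group_size are made up.
df = pl.scan_parquet("test.parquet").collect()
df.write_parquet("test_fixed.parquet", row_group_size=512_000)
```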