sustainableaviation / demandmap

✈️🌐 Map of Global Air Transport (with Future Demand)
BSD 3-Clause "New" or "Revised" License

Investigate use of Sparse Matrices for Tabulated Data Storage #38

Open michaelweinold opened 1 week ago

michaelweinold commented 1 week ago

...since @dodedic's CSV files are currently ~40MB in size, you might want to look into: scipy.sparse.save_npz

import pandas as pd
from scipy.sparse import csr_matrix, save_npz

# Step 1: Load the CSV; adjust the path and options as needed
df = pd.read_csv('01-averageDailyFlights.csv', header=0, index_col=0)

# Step 2: Extract the values as a dense NumPy array (row and column labels are dropped)
dense_matrix = df.to_numpy()

# Step 3: Convert the dense matrix to a sparse matrix
sparse_matrix = csr_matrix(dense_matrix)

# Step 4: Save the sparse matrix in compressed .npz format
save_npz('test.npz', sparse_matrix)

dodedic commented 1 week ago

@michaelweinold

This was successful. Files are now 120 KB.
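
One thing to keep in mind: the conversion above stores only the numeric values, so the row and column labels from the CSV are not in the .npz file. Below is a minimal sketch of a round trip that keeps them, assuming the table is purely numeric and the labels are plain strings. The helper names `save_sparse_table` and `load_sparse_table`, the file names, and the toy airport data are hypothetical and not part of this repository.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix, load_npz, save_npz


def save_sparse_table(df, matrix_path, labels_path):
    """Write the values as a compressed CSR matrix and the labels as a second .npz file."""
    save_npz(matrix_path, csr_matrix(df.to_numpy()))
    # Fixed-width string arrays can be loaded again without pickling.
    np.savez_compressed(
        labels_path,
        index=np.asarray(df.index, dtype=str),
        columns=np.asarray(df.columns, dtype=str),
    )


def load_sparse_table(matrix_path, labels_path):
    """Rebuild a dense DataFrame from the two files written by save_sparse_table."""
    matrix = load_npz(matrix_path)
    with np.load(labels_path) as labels:
        return pd.DataFrame(
            matrix.toarray(), index=labels["index"], columns=labels["columns"]
        )


# Hypothetical toy stand-in for the real CSV: mostly zeros, labelled by airport code.
toy = pd.DataFrame(
    [[0.0, 2.5, 0.0], [2.5, 0.0, 0.0], [0.0, 0.0, 0.0]],
    index=["ZRH", "LHR", "JFK"],
    columns=["ZRH", "LHR", "JFK"],
)

with tempfile.TemporaryDirectory() as tmp:
    matrix_path = os.path.join(tmp, "flights.npz")
    labels_path = os.path.join(tmp, "labels.npz")
    save_sparse_table(toy, matrix_path, labels_path)
    restored = load_sparse_table(matrix_path, labels_path)
```

Calling `matrix.toarray()` brings back the full dense table, which is fine for inspection but gives up the memory savings; code that only needs a few rows or sums could work on the loaded sparse matrix directly.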