zjuwss / gnnwr

A PyTorch implementation of the Geographically Neural Network Weighted Regression (GNNWR)
GNU General Public License v3.0

Crash for > 100,000 rows #6

Open usereight8 opened 2 months ago

usereight8 commented 2 months ago

Both gnnwr and gtnnwr work fine for a DataFrame of up to 10,000 rows, but at 100,000 rows or more either Visual Studio Code crashes or a memory allocation error is raised while executing the init_dataset method. I tried converting the columns from float64 to float16, but they are probably cast back to float64 during the train/test split.
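For reference, this is roughly how I check whether the columns stay downcast — just a minimal pandas sketch with made-up data, not related to init_dataset's internals:

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame standing in for my real data.
df = pd.DataFrame(np.random.rand(100_000, 8), columns=[f"x{i}" for i in range(8)])

# Downcast to float32 (float16 may lose too much precision for coordinates).
df = df.astype(np.float32)

# Report per-column dtypes and total memory so any silent upcast back to
# float64 before/after the split becomes visible.
print(df.dtypes)
print(f"{df.memory_usage(deep=True).sum() / 1e6:.1f} MB")
```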

Is there a way to address this, either by altering the code or by feeding data to init_dataset in sequential chunks? A rough sketch of the chunking idea I have in mind is below.
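This is not part of the gnnwr API — just an illustration of producing the pairwise distance matrix one row-block at a time instead of materialising all of it at once (coordinate array, chunk size, and dtype here are hypothetical):

```python
import numpy as np

def pairwise_distances_chunked(coords, chunk_size=2_000, dtype=np.float32):
    """Yield row-blocks of the Euclidean distance matrix, one chunk at a time."""
    n = coords.shape[0]
    coords = coords.astype(dtype)
    for start in range(0, n, chunk_size):
        block = coords[start:start + chunk_size]          # (m, 2)
        diff = block[:, None, :] - coords[None, :, :]     # (m, n, 2) via broadcasting
        yield np.sqrt((diff ** 2).sum(axis=-1))           # (m, n) distances

# Example: iterate over the blocks without ever holding the full n x n matrix.
coords = np.random.rand(10_000, 2)
for dist_block in pairwise_distances_chunked(coords):
    pass  # feed each block to downstream processing, or write it to disk
```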

yorktownting commented 1 month ago

Yes, this is a known limitation of the current gnnwr model, and we are actively looking for a solution. We believe the structure of the neural network may need to be adjusted so that it does not compute a distance matrix covering all samples, which consumes a very large amount of memory.
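For a rough sense of scale (assuming a dense float64 matrix; the exact layout inside gnnwr may differ):

```python
# Back-of-the-envelope estimate of the full sample-by-sample distance matrix.
n = 100_000
bytes_per_value = 8                 # float64
matrix_bytes = n * n * bytes_per_value
print(f"{matrix_bytes / 1e9:.0f} GB")   # -> 80 GB for 100,000 samples
```

Even with float32 that is still ~40 GB, which is why memory allocation fails long before training starts.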