fatiando / community

Community resources, guidelines, meeting notes, authorship policy, maintenance, etc.

Include extra independent input variables into verde_train_test_split (block) or verde_cross_val_score (block) and using polygons as inputs instead of northing/easting #85

Closed AAzzam91 closed 1 year ago

AAzzam91 commented 1 year ago

Hi @leouieda,

I came across this interesting tool and would like to use some of its functions to handle spatial data in my master's thesis. I have two questions:

The first question is about train_test_split with spacing and the cross-validation tools (BlockKFold and cross_val_score). I have read the documentation and source code but could not find a direct way to do the following: how can I pass extra independent (feature) variables as inputs to these functions?

In the documentation examples I only see coordinates = X (two coordinate variables) and data = y (one target variable). Is it possible to add extra input variables (X) besides the coordinates and still have the data split into blocks? If so, should those extra variables be added to the data argument (y) and then be split into X_train, y_train, X_test, y_test? Or is there another approach to handle this?
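To make the question concrete, here is a minimal plain-Python sketch of the idea behind a block split. It is not Verde's implementation, and `blocked_split_indices`, its parameters, and the `elevation` column are hypothetical names used only for illustration. The sketch assigns each point to a square block from its coordinates, sends whole blocks to either the training or the test set, and returns row indices, so that any number of extra feature columns can be indexed in the same way as the target. Whether Verde's own `train_test_split` accepts several data arrays at once should be confirmed in its documentation.

```python
import math
import random


def blocked_split_indices(easting, northing, spacing, test_size=0.25, seed=0):
    """Return (train_idx, test_idx) so that whole blocks land on one side."""
    # Label every point with the (column, row) of the square block it falls in.
    blocks = {}
    for i, (x, y) in enumerate(zip(easting, northing)):
        key = (math.floor(x / spacing), math.floor(y / spacing))
        blocks.setdefault(key, []).append(i)

    # Shuffle the block labels (not the points) and hold out a share of blocks.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    n_test = max(1, round(test_size * len(keys)))

    test_idx = sorted(i for k in keys[:n_test] for i in blocks[k])
    train_idx = sorted(i for k in keys[n_test:] for i in blocks[k])
    return train_idx, test_idx


# Toy data: coordinates, one hypothetical extra feature, and a target.
easting = [0.5, 1.5, 2.5, 3.5, 0.6, 1.6, 2.6, 3.6]
northing = [0.5, 0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 1.5]
elevation = [10, 12, 15, 11, 9, 14, 13, 16]  # extra input variable
target = [1.0, 1.2, 1.9, 1.1, 0.8, 1.7, 1.4, 2.0]

train_idx, test_idx = blocked_split_indices(easting, northing, spacing=2.0)

# Index every column with the same row indices, features and target alike.
X_train = [(easting[i], northing[i], elevation[i]) for i in train_idx]
y_train = [target[i] for i in train_idx]
X_test = [(easting[i], northing[i], elevation[i]) for i in test_idx]
y_test = [target[i] for i in test_idx]
```

Because the split works on indices, the extra variables do not need to be placed in either the coordinates or the data argument; they only need to share the same row order as the coordinates.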

The second question is about the type of the coordinates input. Could polygons, for example, be used as coordinate inputs instead of points? Or is it only possible to pass point coordinates such as x, y or lon, lat?

The dataset that I am using has:

Thank you for your advice and help.

Kind regards,

A.Azzam

leouieda commented 1 year ago

👋🏾 Hi @AAzzam91, thanks for asking this! I'm going to transfer this to our Forum instead since it's more of a question. You don't have to do anything and should get automatically redirected there.