tnc-br / ddf-isoscapes

Add more capabilities to XGB model #162

Open benwulfe opened 1 year ago

benwulfe commented 1 year ago

Some of these options (such as pulling from GEE) will eventually make their way into ingestion.ipynb. For now, they live only in the XGB notebook as a proof of concept.

Importantly, these changes also allow the XGB colab to consume already-split datasets, although it does not yet support the test set.

review-notebook-app[bot] commented 1 year ago

rothn commented on 2023-08-01T20:58:42Z ----------------------------------------------------------------

Line #18.    def load_training_validation() -> dataset.PartitionedDataset:

There's a lot of flexibility here. I'm curious whether we need all of it.


review-notebook-app[bot] commented 1 year ago

rothn commented on 2023-08-01T20:59:03Z ----------------------------------------------------------------

Line #10.    _OVERLAP_PARTITION_STRATEGY = dataset.FixedPartitionStrategy(
  # Train
  dataset.DatasetGeographicPartitions(
    min_longitude=-60.5,
    max_longitude=float('inf'),
    min_latitude=float('-inf'),
    max_latitude=float('inf'),
  ),
  # Validation
  dataset.DatasetGeographicPartitions(
    min_longitude=float('-inf'),
    max_longitude=-60.5,
    min_latitude=float('-inf'),
    max_latitude=float('inf'),
  ),
  # Test
  dataset.DatasetGeographicPartitions(
    min_longitude=float('-inf'),
    max_longitude=float('inf'),
    min_latitude=float('-inf'),
    max_latitude=float('inf')
  )
)

Would you mind explaining the partition strategy here and what it tries to accomplish?
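
For context on the question, here is a minimal sketch of what the bounds quoted above imply when read on their own. It does not use the repository's `dataset` module: `Bounds`, `PARTITIONS`, and `partitions_for` are hypothetical stand-ins, and treating both ends of each range as inclusive is an assumption, since the snippet does not show how `DatasetGeographicPartitions` handles boundaries.

```python
from dataclasses import dataclass

INF = float('inf')


@dataclass(frozen=True)
class Bounds:
    """Hypothetical stand-in for dataset.DatasetGeographicPartitions."""
    min_longitude: float
    max_longitude: float
    min_latitude: float
    max_latitude: float

    def contains(self, longitude: float, latitude: float) -> bool:
        # Assumption: both ends are inclusive; the real class may differ.
        return (self.min_longitude <= longitude <= self.max_longitude
                and self.min_latitude <= latitude <= self.max_latitude)


# Same numeric bounds as the quoted _OVERLAP_PARTITION_STRATEGY.
PARTITIONS = {
    'train': Bounds(-60.5, INF, -INF, INF),
    'validation': Bounds(-INF, -60.5, -INF, INF),
    'test': Bounds(-INF, INF, -INF, INF),
}


def partitions_for(longitude: float, latitude: float) -> list[str]:
    """Return the name of every partition whose bounds contain the point."""
    return [name for name, bounds in PARTITIONS.items()
            if bounds.contains(longitude, latitude)]


print(partitions_for(-55.0, -3.0))  # ['train', 'test']
print(partitions_for(-65.0, -3.0))  # ['validation', 'test']
```

Under these assumptions, train and validation split at longitude -60.5 (east and west of it, respectively), while the test bounds are unbounded and so cover every point already assigned to train or validation. That overlap may be what the `_OVERLAP_` prefix refers to, but the thread does not confirm it.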