Accept optional date in `process_parquet` for more precise subsetting

Currently we take the `end_date` of the training data and go back one year to subset the training time series: https://github.com/WorldCereal/presto-worldcereal/blob/e8d5bbc173c581d197c3810fdcdf3a8768e9bc9a/presto/inference.py#L397-L410

However, if we train a dedicated CatBoost model on a subset of the data for a small AOI, we may benefit from subsetting based on the `start_date` and `end_date` requested by the user (these have to span one year). Can we adapt the method to accept an optional argument, e.g. `end_date`, which, if given, dictates the subsetting of the time series?

We then need to be resilient to training samples coming from different years, and also drop samples whose time series don't fall entirely within the requested time frame (shifted to the year of the sample).
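
To make the idea concrete, here is a rough sketch of the per-sample window selection. It is not the existing `process_parquet` code: the helper names (`shift_to_year`, `window_for_sample`), the use of per-sample `sample_start` / `sample_end` dates, and the day-level granularity are all assumptions for illustration. It moves the requested end date into the sample's own year(s), takes the one-year window ending there, and returns `None` when the sample doesn't fully cover that window.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def shift_to_year(d: date, year: int) -> date:
    """Move d to `year`, mapping Feb 29 to Feb 28 in non-leap years."""
    try:
        return d.replace(year=year)
    except ValueError:
        return d.replace(year=year, day=28)


def window_for_sample(
    sample_start: date,
    sample_end: date,
    requested_end: Optional[date] = None,
) -> Optional[Tuple[date, date]]:
    """Return the (start, end) one-year window to keep, or None to drop the sample.

    Hypothetical helper: sample_start / sample_end are assumed to be the first
    and last dates for which the sample has data.
    """
    if requested_end is None:
        # Behaviour described in the issue: go back one year from the sample's end_date.
        candidate_ends = [sample_end]
    else:
        # Try the requested month/day in every year the sample touches, latest first.
        candidate_ends = [
            shift_to_year(requested_end, year)
            for year in range(sample_end.year, sample_start.year - 1, -1)
        ]

    for end in candidate_ends:
        # One-year window ending on `end`, both bounds inclusive.
        start = shift_to_year(end, end.year - 1) + timedelta(days=1)
        if start >= sample_start and end <= sample_end:
            return start, end

    # No candidate window fits entirely inside the sample's data: drop it.
    return None
```

For example, a sample with data from 2018-01-01 to 2019-12-31 and a requested `end_date` of 2021-10-31 would be subset to 2018-11-01 through 2019-10-31, while a sample that ends on 2019-09-30 would be dropped. Note that in this sketch the default case (no `end_date`) also drops samples with less than one year of data, which may or may not match what we want.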
