Closed by rbavery 1 year ago
I made a figshare for the vector datasets and a smaller version of the problem file.
This now downloads in about a minute for me. What about for folks in the Netherlands, @rogerkuou @fnattino?
https://figshare.com/ndownloader/files/37729413
Should we go ahead and use figshare instead of pdok in the setup instructions, so that learners download the dataset from figshare? I haven't checked the pdok license info yet, but I assume we can distribute these files on figshare.
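For anyone who wants to compare download times from their own location, here is a minimal sketch that streams a URL to disk and reports how long it took. The URL is the figshare link above; the function name `timed_download` and the output filename mentioned below are placeholders of mine, and I have not verified how figshare responds to Python's default HTTP client, so treat this as a starting point rather than a tested recipe.

```python
import time
import urllib.request

# URL copied from the comment above.
FIGSHARE_URL = "https://figshare.com/ndownloader/files/37729413"


def timed_download(url, dest, chunk_size=1024 * 1024):
    """Stream `url` to the file `dest`; return (bytes_written, seconds_elapsed)."""
    start = time.perf_counter()
    written = 0
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        while True:
            # Read in chunks so a large file never has to fit in memory.
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written, time.perf_counter() - start
```

Calling `timed_download(FIGSHARE_URL, "figshare_download.bin")` (the filename is hypothetical) returns the size in bytes and the elapsed seconds, which should make it easy to report numbers from different regions.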
I had a look at the metadata for the crop dataset here and it seems to list three licenses:
http://creativecommons.org/publicdomain/mark/1.0/deed.nl
http://inspire.ec.europa.eu/metadata-codelist/ConditionsApplyingToAccessAndUse/noConditionsApply
http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations
Judging by these, hosting it on figshare looks fine, but perhaps someone else knows better?
For example:
Van dit werk is vastgesteld dat er geen bekende auteursrechtelijke beperkingen op rusten, alle aanverwante en naburige rechten daarbij inbegrepen. Je mag het werk zonder toestemming kopiëren, wijzigen, verspreiden, en uitvoeren, zelfs voor commerciële doeleinden.
Translates to:
This work has been determined to have no known copyright restrictions, including all related and neighboring rights. You may copy, modify, distribute, and perform the work without permission, even for commercial purposes.
I see, @rbavery. Indeed, a download that takes this long is not really acceptable. Downloading the dataset you created from figshare takes less than a minute here as well, so we can go ahead and use it as the source of the vector data (and thanks a lot for checking the licenses, @raar1!). Do you agree, @rogerkuou?
We will use the updated data on Figshare from now on.
When downloading the datasets with the curl command suggested in the episode setup, it takes over a minute to get to 4% progress (I'm on the US west coast).
from https://esciencecenter-digital-skills.github.io/geospatial-python/setup.html
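To put that progress figure in perspective, here is a rough extrapolation, assuming the transfer rate stays constant (which it may not). The function name is mine, and 60 seconds is used as a lower bound for "over a minute".

```python
def estimated_total_seconds(elapsed_seconds, fraction_done):
    """Extrapolate total download time, assuming a constant transfer rate."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_seconds / fraction_done


# "Over a minute to get to 4%" gives a lower bound on the total time.
total = estimated_total_seconds(60, 0.04)
print(f"at least {total / 60:.0f} minutes")
```

Under that assumption the full download would take at least 25 minutes, far beyond the one-minute target.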
I think we need to find a solution for this. Ideally, downloading these data would take at most a minute. Some solutions:
I'm not sure what the cost implications of these different options are. Previously, when I used figshare for the small raster datasets, I think we had around 300 MB hosted in a single location, and it took about a minute to download on both the east coast and the west coast.
What do you think @fnattino @rogerkuou ?
Also, it looks like this file is the culprit; it's half a GB: brpgewaspercelen_definitief_2020.gpkg