Open henrykironde opened 3 years ago
Hey @henrykironde! Would love to start contributing, and I believe adding datasets might be a good place to start. Could I pick one up from the lot or would you be assigning any particular one?
Hi @pri1311, Feel free to pick any data source. Let me know in case you need any clarification.
> Let me know in case you need any clarification.
I have added a simple dataset as of now to get a basic idea of the repository. If the PR is merged/approved, will move on to more datasets. I am particularly interested in a separate open issue - Adding support for sequence data.
Also, I had one small question. I was going through some of the JSON files in the retriever-recipes repository. A lot of the Kaggle datasets were included. But since Kaggle allows downloading test and train data all at once as a zip file, how will those be added to this package? (I ask because I saw Kaggle mentioned as one of the data sources here.)
@pri1311 for sequence data, I have not found suitable sources yet, but you can go for it.
> since Kaggle allows downloading test and train data all at once as a zip file,
That is a good case, since we download all the data using one URL. We can then extract all the files or just a particular file. Check out the JSON files that use extract for some examples: https://github.com/weecology/retriever-recipes/search?q=extract
Let me know in case you have more issues or need clarification.
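To make the "download one archive, then extract everything or just one file" idea concrete, here is a minimal sketch using only Python's standard library. It is not the retriever's own implementation; `extract_from_zip` and `demo` are hypothetical names made up for illustration, and the demo builds a small in-memory archive with `train.csv` and `test.csv` instead of downloading anything from Kaggle.

```python
import io
import tempfile
import zipfile
from pathlib import Path
from typing import List, Optional, Tuple


def extract_from_zip(archive: bytes, dest: Path, member: Optional[str] = None) -> List[str]:
    """Extract one named member, or every member, of a zip archive held in memory.

    Hypothetical helper for illustration only, not part of the retriever API.
    In practice `archive` could be the bytes returned by
    urllib.request.urlopen(url).read() for a single download URL.
    """
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        # Either the single requested file or the full listing of the archive.
        names = [member] if member else zf.namelist()
        for name in names:
            zf.extract(name, path=dest)  # raises KeyError if the member is missing
    return names


def demo() -> Tuple[List[str], List[str], List[str]]:
    """Build a toy train/test archive and extract it both ways."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("train.csv", "id,label\n1,a\n")
        zf.writestr("test.csv", "id\n2\n")
    archive = buf.getvalue()

    with tempfile.TemporaryDirectory() as tmp:
        # Extract only a particular file.
        one = extract_from_zip(archive, Path(tmp) / "one", member="train.csv")
        # Extract all the files.
        everything = extract_from_zip(archive, Path(tmp) / "all")
        on_disk = sorted(p.name for p in (Path(tmp) / "all").iterdir())
    return one, sorted(everything), on_disk
```

Calling `demo()` extracts `train.csv` alone into one directory and both files into another, which mirrors the two options described above for a single-URL zip download.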
ref https://www.mrlc.gov/nlcd2011.php
citation: """ Preferred NLCD 2011 citation: Homer, C.G., Dewitz, J.A., Yang, L., Jin, S., Danielson, P., Xian, G., Coulston, J., Herold, N.D., Wickham, J.D., and Megown, K., 2015, Completion of the 2011 National Land Cover Database for the conterminous United States-Representing a decade of land cover change information. Photogrammetric Engineering and Remote Sensing, v. 81, no. 5, p. 345-354"""
Also ref https://www.mrlc.gov/nlcd06_data.php