conservationtechlab / animl-py

Animl comprises a variety of machine learning tools for analyzing ecological data. This Python package includes a set of functions to classify subjects within camera trap field data and can handle both images and videos.
MIT License

Host Dataset externally - plot_boxes #13

Open srinidhi98 opened 1 year ago

srinidhi98 commented 1 year ago

Try hosting the dataset externally on a web server or cloud service, then use its URL either to load the data directly into a DataFrame or to download it as a .csv file into the local repo.
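A minimal sketch of the download-to-local-repo variant, assuming the dataset is a single CSV reachable at a plain URL. The function name `fetch_csv` and its parameters are hypothetical and not part of animl-py; only the standard library is used for the download.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_csv(url: str, dest: str, force: bool = False) -> Path:
    """Download the CSV at `url` to `dest` unless a local copy already exists.

    `fetch_csv` is a hypothetical helper for illustration, not an animl-py function.
    """
    dest_path = Path(dest)
    if dest_path.exists() and not force:
        # Reuse the cached copy so repeated runs do not hit the server again.
        return dest_path
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk instead of holding it all in memory.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

The returned path can then be passed to `pd.read_csv`. If no local copy is wanted, `pd.read_csv(url)` also accepts a URL directly, which covers the "directly into a DataFrame" variant.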

iingram commented 1 year ago

I know that hosting on Kaggle is still among the options being considered, but the extra step of creating a Kaggle account before a fresh user can download the data is very plausibly a non-negligible barrier that we want to avoid. I am noting that to keep us thinking about it, but I am certainly still game for seeing the full side-by-side that includes Kaggle as an option and fully weighing the pros and cons.

srinidhi98 commented 1 year ago

For Kaggle, yes the user must have

sample code:

    import os
    from zipfile import ZipFile

    import pandas as pd
    from kaggle.api.kaggle_api_extended import KaggleApi

    # Initialize the Kaggle API and authenticate with the local Kaggle credentials
    api = KaggleApi()
    api.authenticate()

    # Dataset identifier on Kaggle: username/dataset name
    dataset_loc = 'srinidhiyerabati/Test-box-plots'

    # Local directory that receives the download
    data_dir = '/home/srinidhiyerbati/Desktop/Srinidhi_Yerabati/animl-py/animl-py/'

    # Download the dataset as a zip archive into data_dir
    # (without path=, the archive lands in the current working directory)
    api.dataset_download_files(dataset_loc, path=data_dir)
    zip_file_path = os.path.join(data_dir, 'Test-box-plots.zip')

    # Extract the zip file
    with ZipFile(zip_file_path, 'r') as zip_ref:
        zip_ref.extractall(data_dir)

    # Load the extracted detections into a DataFrame
    file_path = os.path.join(data_dir, 'detections_plotBoxes.csv')
    df = pd.read_csv(file_path)
    print(df.head())