mlpack / models

models built with mlpack
https://models.mlpack.org/docs
BSD 3-Clause "New" or "Revised" License

Distribution of pretrained models #66

Closed Aakash-kaushik closed 3 years ago

Aakash-kaushik commented 3 years ago

Creating an issue to discuss how to distribute pretrained model files. cc: @rcurtin

gdrive link for resnet models: https://drive.google.com/drive/folders/15qqEhZnylW6OTk7HYrwOWiXzTiAK8-h6?usp=sharing

rcurtin commented 3 years ago

We'll need to modify the website build scripts just a little bit. I think this should probably be separate from the datasets/ directory. (For that, the workflow when building the website is just that we get a file datasets.tar.gz, which is hosted locally under a different domain name on mlpack.org, as well as on Zenodo, and unpack it.) I guess we can do the same thing here... just make a models.tar.gz that we maintain somewhere (not sure where? I guess we could use Zenodo again?), and then host it 'locally' under a different subdomain.

Another idea could just be to make a separate subdomain models.mlpack.org, which is just a static subdomain that we (manually) dump files into. In fact, we could even do the same thing for the datasets.

Any thoughts or preferences on how to do it? Honestly I am leaning a little bit towards the models.mlpack.org strategy---and even adapting the datasets scripts to live in datasets.mlpack.org/ (and thus simplifying the website rebuild process).

@Aakash-kaushik so that you can continue development work, for now I'm just putting the gdrive models in https://www.ratml.org/misc/models/. Once we figure out what to do here, we can then just change the URL in the code accordingly.

(CC: @zoq since you've been pretty involved with the dataset setup)
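Since the hosting location is still undecided, one way to make the later URL change a one-line edit is to keep the base URL in a single constant. The following is a minimal sketch of that idea, not the repository's actual code: `MODEL_BASE_URL` and `model_url` are hypothetical names, and the relative path `resnet/resnet34.bin` is an assumed layout used only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical constant: the only place the hosting location is recorded.
# It points at the temporary mirror for now and can be swapped later.
MODEL_BASE_URL = "https://www.ratml.org/misc/models/"


def model_url(relative_path: str, base_url: str = MODEL_BASE_URL) -> str:
    """Return the download URL for a model file under the base URL."""
    # urljoin replaces the last path segment unless the base ends with "/",
    # so normalise the base first.
    if not base_url.endswith("/"):
        base_url += "/"
    # Strip a leading "/" so the path is resolved relative to the base.
    return urljoin(base_url, relative_path.lstrip("/"))


# Assumed directory layout, for illustration only.
url = model_url("resnet/resnet34.bin")
```

With this shape, moving to a different host only means changing `MODEL_BASE_URL` (or passing another `base_url`); callers stay the same.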

rcurtin commented 3 years ago

(The models are still uploading... it will be done in probably an hour or so...)

Aakash-kaushik commented 3 years ago

@rcurtin I also lean more towards hosting a static site rather than expanding archives every time we build the site, which doesn't make much sense. As you said, we can have a simple place where we just dump files. And if you want, we could create a single site that uses directories as URL paths, say mlpack.org/datasets and similarly for models?

Aakash-kaushik commented 3 years ago

> (The models are still uploading... it will be done in probably an hour or so...)

Thanks for uploading the models for the time being.

rcurtin commented 3 years ago

Yeah, I wonder why I didn't use a subdomain in the first place...

It would be way easier to set up models.mlpack.org/ as opposed to mlpack.org/models/, since all of mlpack.org/ will be rebuilt every night and I'd need to make sure that the models/ URL correctly forwarded. So unless there is some reason not to that I overlooked, I think it would be fine to host as, e.g., models.mlpack.org/resnet/resnet34.bin.

Aakash-kaushik commented 3 years ago

I believe models.mlpack.org would be a better choice then.

zoq commented 3 years ago

I would also just go for a subdomain, but I think there is no downside in using mlpack.org/models/. That said, I'm not sure you need models.tar.gz; I think users are interested in specific models, so I would probably just offer each model file and the corresponding labels.txt file.
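To illustrate the per-model option, here is a minimal sketch of how client code might derive the two URLs it needs for one model. Everything in it is an assumption for illustration: `ModelFiles` and `model_files` are hypothetical names, the `.bin` extension follows the `resnet34.bin` example mentioned earlier in the thread, and placing `labels.txt` in the model family's directory is a guess at the layout, not something decided here.

```python
from typing import NamedTuple


class ModelFiles(NamedTuple):
    """URLs for a single model: its weights and its label list."""
    weights_url: str
    labels_url: str


def model_files(base_url: str, family: str, name: str) -> ModelFiles:
    """Build the URLs for one model without needing a combined archive."""
    # Assumed layout: <base>/<family>/<name>.bin and <base>/<family>/labels.txt
    root = f"{base_url.rstrip('/')}/{family}"
    return ModelFiles(
        weights_url=f"{root}/{name}.bin",
        labels_url=f"{root}/labels.txt",
    )


files = model_files("http://models.mlpack.org", "resnet", "resnet34")
```

A user who wants only one model then fetches two small, predictable URLs instead of downloading and unpacking a models.tar.gz.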

Aakash-kaushik commented 3 years ago

Hey @rcurtin, let me know when we have a final place for the models, so that I can update the current code to account for the change.

rcurtin commented 3 years ago

Done: http://models.mlpack.org/resnet/

The datasets still need to be changed though. I guess that is not relevant to this issue though, so we can close this one. Let me know if there is more I should add to that subdomain. :+1:

Aakash-kaushik commented 3 years ago

> Done: http://models.mlpack.org/resnet/
>
> The datasets still need to be changed though. I guess that is not relevant to this issue though, so we can close this one. Let me know if there is more I should add to that subdomain.

I believe that's all. Thank you so much! :tada: