Closed alvaromorales closed 9 years ago
Cool! Found this documentation; it looks like I can set up Python to download these files automatically during the installation of my module. I will probably need to use secret environment variables like you did for Elasticstart.
Just use the public URLs instead of the Swift client; no need to have full OpenStack access (security reasons).
Ok, thanks. How do you recommend I upload new files when needed?
We can get you an OpenStack account, or you can just send me the file and I'll upload it. We can think of automating this later on.
Also, do we really need to upload the DBpedia ontology HTML? Can we just use requests to access the URL and download it?
We can use OpenStack's Swift object storage to store data files that don't belong in version control.
HTML ontology classes: https://ceph.csail.mit.edu/swift/v1/infolab/ontology-classes.html
Infobox counts: https://ceph.csail.mit.edu/swift/v1/infolab/infoboxes.tsv
These URLs should probably be in some config variable, and should be downloaded locally to a dir in .gitignore (to avoid downloading over and over).
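A minimal sketch of that idea, assuming the URLs live in a module-level dict and the cache directory is called data_cache (both names are made up here, not something the project already has). It uses urllib from the standard library to stay dependency-free; requests would work just as well. A file is only downloaded when it is not already in the cache directory.

```python
import os
import urllib.request

# URLs copied from this issue; in practice they might live in a config file.
DATA_URLS = {
    "ontology-classes.html": "https://ceph.csail.mit.edu/swift/v1/infolab/ontology-classes.html",
    "infoboxes.tsv": "https://ceph.csail.mit.edu/swift/v1/infolab/infoboxes.tsv",
}

# Hypothetical cache directory name; it would be listed in .gitignore.
CACHE_DIR = "data_cache"


def _http_get(url):
    """Return the body of a public URL as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_data_file(name, cache_dir=CACHE_DIR, fetch=_http_get):
    """Return a local path for `name`, downloading it only if not cached."""
    local_path = os.path.join(cache_dir, name)
    if os.path.exists(local_path):
        return local_path  # already downloaded, skip the network

    os.makedirs(cache_dir, exist_ok=True)
    data = fetch(DATA_URLS[name])

    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file that looks like a valid cache entry.
    tmp_path = local_path + ".part"
    with open(tmp_path, "wb") as f:
        f.write(data)
    os.replace(tmp_path, local_path)
    return local_path
```

Calling fetch_data_file("infoboxes.tsv") twice would hit the network only the first time. The fetch parameter is there so the download step can be swapped out, for example in tests.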
For now these URLs are public, but we can generate temp urls with authentication if needed.
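If we go that route, a rough sketch of how a temp URL could be generated is below. It follows the signing scheme of OpenStack Swift's TempURL middleware (HMAC-SHA1 over the method, expiry time, and object path). The function name and parameters are made up for illustration, and it assumes a temp URL key has been set on the account and that our Ceph gateway signs the same path that appears in the URL; both would need checking before relying on this.

```python
import hmac
import time
from hashlib import sha1
from urllib.parse import urlsplit


def make_temp_url(url, key, ttl_seconds, method="GET", now=None):
    """Return `url` with Swift TempURL query parameters appended.

    `key` is the secret temp URL key (assumed to be configured on the
    account); `now` can be passed to make the result reproducible.
    """
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)

    # Swift TempURL signs "METHOD\nEXPIRES\nPATH".
    path = urlsplit(url).path
    message = f"{method}\n{expires}\n{path}"
    signature = hmac.new(key.encode(), message.encode(), sha1).hexdigest()

    return f"{url}?temp_url_sig={signature}&temp_url_expires={expires}"
```

The key itself would have to stay out of the repo, e.g. in a secret environment variable.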
The files should also have a date associated with them (e.g. infoboxes-2015-08-15.tsv).
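One possible way to handle the dated names, sketched with two hypothetical helpers (neither exists in the project): one builds a name like the example above, the other picks the most recent version from a list of names.

```python
import datetime
import os
import re


def dated_name(filename, date):
    """Insert an ISO date before the extension: infoboxes.tsv -> infoboxes-2015-08-15.tsv."""
    stem, ext = os.path.splitext(filename)
    return f"{stem}-{date.isoformat()}{ext}"


def latest_version(names, filename):
    """Return the newest dated variant of `filename` in `names`, or None."""
    stem, ext = os.path.splitext(filename)
    pattern = re.compile(
        re.escape(stem) + r"-(\d{4}-\d{2}-\d{2})" + re.escape(ext) + r"$"
    )
    dated = []
    for name in names:
        match = pattern.match(name)
        if match:
            dated.append((match.group(1), name))
    # ISO dates sort correctly as plain strings.
    return max(dated)[1] if dated else None
```

For example, dated_name("infoboxes.tsv", datetime.date(2015, 8, 15)) gives "infoboxes-2015-08-15.tsv".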