OpenSextant / OpenSextantToolbox

A geotagger and entity extractor

docker #12

Closed (johnrfrank closed this issue 8 years ago)

johnrfrank commented 9 years ago

I wrote a dockerfile:

https://github.com/streamcorpus/OpenSextantToolbox/tree/master/docker

Before I send a pull request, let's discuss how version numbers get created. What do you think of using git tags?

dlutz2 commented 9 years ago

git tags would be fine. I have been pretty slack on getting out of snapshot mode to lock down a version beyond 2.1

johnrfrank commented 9 years ago

I'm not sure if tags get pulled across in pull requests; let's assume that they do. Are you okay with receiving these tags that I made in my fork?

https://github.com/streamcorpus/OpenSextantToolbox/releases

As you can see in the Dockerfile, it pulls one of those in its build process.

It would be nice if the Ant release process also read the current tag. In Python projects, we do this by generating a RELEASE-VERSION file when cutting the artifact. There is probably a nice way to do this with Maven too. Here is a relevant post: http://stackoverflow.com/questions/2863756/is-there-a-single-git-command-to-get-the-current-tag-branch-and-commit

John
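For illustration, here is a minimal sketch of the RELEASE-VERSION idea in Python. It is not the script from any of the projects mentioned here: the function names and the version string layout are assumptions, and it only relies on `git describe --tags --long`, which prints the nearest tag, the number of commits since that tag, and an abbreviated commit hash.

```python
import subprocess
from pathlib import Path


def version_from_describe(describe: str) -> str:
    """Convert `git describe --tags --long` output into a version string.

    "v2.1-0-gabc1234" becomes "2.1" (the checkout is exactly on the tag).
    "v2.1-3-gabc1234" becomes "2.1.dev3+gabc1234" (3 commits past the tag).
    The ".devN+hash" layout is an assumption chosen for this sketch.
    """
    # Split from the right so tags that contain hyphens stay intact.
    tag, commits, sha = describe.strip().rsplit("-", 2)
    if tag.startswith("v"):
        tag = tag[1:]
    if commits == "0":
        return tag
    return f"{tag}.dev{commits}+{sha}"


def write_release_version(path: str = "RELEASE-VERSION") -> str:
    """Ask git for the current tag and write the version to `path`.

    Must be run inside a git checkout that has at least one tag.
    """
    describe = subprocess.check_output(
        ["git", "describe", "--tags", "--long"], text=True
    )
    version = version_from_describe(describe)
    Path(path).write_text(version + "\n")
    return version
```

Calling `write_release_version()` as a step in the release build would leave a RELEASE-VERSION file next to the artifact, which an Ant or Maven build could then read as a plain property value.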

dlutz2 commented 9 years ago

I saw that the Dockerfile fetched a built artifact as well as the gazetteer data. We haven't been putting any built artifacts (releases) on GitHub. I was under the impression that a free project was allowed only a limited amount of storage (~1-2 GB?), which we would quickly exceed. StreamCorpus is a paid-for project, yes? I would love to get the prebuilt releases and the gazetteer data off of the opensextant.org site, since we have had so many issues with it.

johnrfrank commented 9 years ago

StreamCorpus is also a free FOSS project. The GitHub hosting is probably only useful for the code artifacts, which will also eventually be too big, and we'll have to delete some old versions.

The large gazetteer releases need a different home. Have you looked at S3? You could put the whole opensextant.org site in an S3 bucket, and use Route53 for DNS. Storage costs three cents per GB per month. That's about 25 cents/year for each release.

http://aws.amazon.com/s3/pricing/
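As a quick check of the storage arithmetic, here is a small Python sketch. The three cents per GB per month price is the figure quoted in this thread; the release size is not stated anywhere, so the roughly 0.7 GB below is only what the 25 cents/year estimate implies, not a measured number.

```python
PRICE_PER_GB_MONTH = 0.03  # USD, the figure quoted in this thread


def yearly_storage_cost(size_gb: float, price: float = PRICE_PER_GB_MONTH) -> float:
    """Yearly cost in USD of keeping `size_gb` gigabytes stored."""
    return size_gb * price * 12


def size_for_yearly_cost(cost_usd: float, price: float = PRICE_PER_GB_MONTH) -> float:
    """Size in GB that a given yearly storage budget pays for."""
    return cost_usd / (price * 12)


# One GB stored for a year costs 12 * 0.03 = 0.36 USD.
one_gb_per_year = yearly_storage_cost(1.0)

# Working backwards, 0.25 USD/year corresponds to about 0.69 GB per release.
implied_release_gb = size_for_yearly_cost(0.25)
```

This covers storage only; request and data transfer charges are billed separately and are not included.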

Here's an example of a Makefile that we use for one of our public websites hosted in S3: https://github.com/trec-dd/trec-dd.org/blob/master/Makefile

dlutz2 commented 8 years ago

Build now uses released gazetteer data from OpenSextant Gazetteer.