bmschmidt / medicalHeritageVM

A DH box for Miriam Posner and Ben Schmidt's 2016 workshops in Bethesda

Provision sample data for workshop #2

Open bmschmidt opened 8 years ago

bmschmidt commented 8 years ago

@miriamposner and I both have sample data. It's too big to bundle into the git repo for the download.

I think the way to solve this is to add something like these lines https://github.com/bmschmidt/medicalHeritageVM/blob/master/puppet/manifests/rstudio-server.pp#L110-L116 to the Puppet manifest. They use wget to download a batch of files from a URL and put them into an appropriate directory. Then another Puppet command will extract them. (Shell might work as well as Puppet for this.)

So we each just need to find somewhere on the Internet we can park our data for download.
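For reference, the download-then-extract step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Puppet resource the box uses: the function name `fetch_and_extract` and the destination directory are placeholders, and it assumes the data is published as a single zip archive at a plain URL that needs no login or button click.

```python
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_extract(url, dest_dir):
    """Download a zip archive from `url` and unpack it into `dest_dir`.

    Hypothetical helper for illustration; returns the names of the
    top-level entries found in `dest_dir` after extraction.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Stream to a temporary file so a large archive is not held in memory.
    with tempfile.TemporaryFile() as tmp:
        with urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, tmp)
        tmp.seek(0)
        with zipfile.ZipFile(tmp) as archive:
            archive.extractall(dest)
    return sorted(p.name for p in dest.iterdir())
```

Calling `fetch_and_extract(some_url, "sample_data")` would leave the unpacked files under `sample_data/`. A wget-plus-unzip shell command would do the same job; the Python version is shown only because it is easy to run and test outside the VM.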

miriamposner commented 8 years ago

This sounds great. Do you think this will work https://ucla.box.com/s/3t2jgupei1brtkvm21leoho4lgw8doe4? There's an intermediate step, in that the "Download" button has to be clicked on.

bmschmidt commented 8 years ago

Hmm, probably not (see https://community.box.com/t5/Help-Forum/Can-I-use-wget-to-download-data-from-BOX/td-p/7803), but I could just store the file on my server for the time being. I'll let you know if that doesn't work; then maybe Dropbox or Google Drive or something.

Currently wading through some really frustrating changes to the default MySQL configuration files that break everything on my end before I can fix this…

miriamposner commented 8 years ago

Ah, excellent, technology is great.

I just put my files on my server instead: www.miriamposner.com/files/xray.zip.

bmschmidt commented 8 years ago

OK, I've successfully started a box that downloads both xray.zip from miriamposner.com and sample_files.zip from benschmidt.org. A few other problems remain; I'm not sure if it unzips the xray file properly.

The general strategy is going to be that there are two directories, /images and /texts. That's where all the specific workshop stuff will go. That should make it easier for participants, and it means that if we or anyone else wants to fork the machine later, it will be easy to remove the workshop material from the process. I've put your Python script in that folder.

miriamposner commented 8 years ago

Huh, OK, so it installs the box OK, but when I run vagrant up, I get the following error message:

Bringing machine 'default' up with 'virtualbox' provider...
There are errors in the configuration of this machine. Please fix
the following errors and try again:

vm:
* The host path of the shared folder is missing: texts

Any thoughts about this?

miriamposner commented 8 years ago

Oh, actually, I see an "Images" directory but not a "Texts" directory in the Vagrant folder I downloaded from Github. I added a "texts" folder and it seems to be working now. I'll let you know how it all goes in a bit.

bmschmidt commented 8 years ago

Ugh, apparently git won't let you commit an empty folder. I've added a temporary fix. But…
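A common workaround for git ignoring empty folders is to put a placeholder file in each one so there is something to commit. The sketch below shows that idea; it is not necessarily the temporary fix mentioned above. The helper name `ensure_tracked_dirs` is made up, the folder names `images` and `texts` are taken from this thread, and `.gitkeep` is only a convention (git gives the name no special meaning).

```python
from pathlib import Path


def ensure_tracked_dirs(repo_root, names=("images", "texts")):
    """Create each folder and add a placeholder file so git can track it.

    Hypothetical helper; returns the placeholder paths it created.
    """
    created = []
    for name in names:
        folder = Path(repo_root) / name
        folder.mkdir(parents=True, exist_ok=True)
        placeholder = folder / ".gitkeep"  # conventional name, not special to git
        if not placeholder.exists():
            placeholder.touch()
            created.append(str(placeholder))
    return created
```

With the placeholders committed, a fresh clone would contain both folders, so the shared-folder paths Vagrant expects on the host would already exist.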

miriamposner commented 8 years ago

Oh, bummer. Anyway, it mostly seemed to work! My images are there, Bookworm is there at http://localhost:8007/D3, your texts are there, I'm able to SSH in.

The only thing is, I'm not actually sure it's installing OpenCV. When I run locate-figures.py, it can't find a module named cv. I can look at this again tonight and see if I made a mistake with the Vagrantfile.
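One quick way to tell whether OpenCV's Python bindings made it into the VM is to ask Python which of the usual module names it can find. This is a rough diagnostic sketch, not part of locate-figures.py: the function name `find_opencv_module` is invented, and it assumes the bindings would be importable as either `cv2` (the current name) or `cv` (the legacy name the script asks for).

```python
import importlib.util


def find_opencv_module(candidates=("cv2", "cv")):
    """Return the name of the first importable candidate module, or None."""
    for name in candidates:
        # find_spec looks the module up without actually importing it.
        if importlib.util.find_spec(name) is not None:
            return name
    return None
```

Running `print(find_opencv_module())` inside the VM with the same Python interpreter that runs the script would show whether either name is available. A result of `None` would suggest the install step did not run at all, rather than a problem in the script.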

bmschmidt commented 8 years ago

Yeah that's right, I don't see it either. This is kind of my fault: I deleted a line from the Vagrantfile that I thought shouldn't make a difference. Was it working before?

miriamposner commented 8 years ago

Yes, it was, but I had the file in a different Vagrant folder (without all of the extra text stuff) and only copied the Vagrantfile over into medicalHeritageVM right before I made the pull request. So maybe something with the new file structure messed it up?

miriamposner commented 8 years ago

Trying now with the old Vagrantfile in medicalHeritageVM.

miriamposner commented 8 years ago

OK, yeah, OpenCV did actually seem to work just now (using my Vagrantfile in the new VM).