JonathanReeve / sanger

Margaret Sanger Papers Project Search Engine

push from GitHub webhook should automatically initiate parse script #72

Open CathyHajo opened 9 years ago

CathyHajo commented 9 years ago

I loaded new versions of two files: 421058, Speech at the Santa Rita Hotel, and 421997 (the infamous one with the conflicts), Address on the Woman Rebel Case Dismissal. When I immediately searched for "Santa Rita" on the DigitalOcean copy of the Sanger papers, it came up with all the new edits. But when I searched for "dismissal" in the title, 421997 did not come up, nor did it when I searched for "Woman Rebel" in the title. When I entered the URL manually, the document was there, with the new corrections. Weird.

JonathanReeve commented 9 years ago

I'm guessing this is because the new XML file has successfully made it to GitHub and then to the server, but the file hasn't been parsed into the database, and so is invisible to the search engine. The automated updating script from #65 will pull in the changes from GitHub, but it won't automatically run the parsing script. If the file you changed is in xml_added, you might have to do this:

  1. commit and sync all changes you've already made
  2. move the file to xml_queue on your local copy
  3. commit and sync those changes (perhaps with a commit message like "moved document to XML queue for parsing"). This should automatically push the file to the server.
  4. visit http://sangerpapers.org/sanger/app/documents/parse2.php, where you can select and parse the newly-corrected file.

If the file is already in xml_queue, just do steps 1 and 4.
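For what the title of this issue is asking for, the first piece would be working out which pushed files need parsing at all. Here is a minimal sketch in Python, not the project's actual code. It relies on the GitHub push webhook payload carrying a `commits` list whose entries have `added`, `modified`, and `removed` path arrays. The function name `files_to_parse`, the assumption that any directory named `xml_queue` in a path marks a queued file, and the example filenames are all hypothetical. A file moved from xml_added to xml_queue would be expected to show up under `added` at its new path.

```python
from pathlib import PurePosixPath
from typing import List

QUEUE_DIR = "xml_queue"  # assumption: the queue directory's name in the repo


def files_to_parse(payload: dict, queue_dir: str = QUEUE_DIR) -> List[str]:
    """Return the XML paths in the queue directory that a push touched.

    `payload` is the decoded JSON body of a GitHub push webhook.
    Removed files are ignored, since there is nothing left to parse.
    """
    touched: List[str] = []
    for commit in payload.get("commits", []):
        for key in ("added", "modified"):
            for path in commit.get(key, []):
                p = PurePosixPath(path)
                is_xml = p.suffix.lower() == ".xml"
                # True when any parent directory is the queue directory.
                in_queue = queue_dir in p.parts[:-1]
                if is_xml and in_queue and path not in touched:
                    touched.append(path)
    return touched


# Hypothetical payload: one document moved into the queue, one unrelated edit.
example = {
    "commits": [
        {
            "added": ["xml_queue/421997.xml"],
            "modified": ["README.md"],
            "removed": ["xml_added/421997.xml"],
        }
    ]
}
print(files_to_parse(example))  # ['xml_queue/421997.xml']
```

The list this returns could then be handed to whatever runs the parsing step, so that step 4 no longer has to be done by hand. Files that only changed inside xml_added are skipped on purpose, which matches the manual procedure above.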

This brings up some interesting ideas: