hltcoe-bot opened this issue 4 years ago (status: Open)
vandurme ccostello charman
This seems like the most relevant existing ticket, so moving discussion here. Cash's list is a superset of what I mentioned on github. For completeness, here is the list of issues from his recent email:
My goal, indeed, is to make this first work for non-javascript-reliant templates, without disrupting any existing functionality and entirely opt-in on the project+worker level.
Poster: Thomas Lippincott
lippincott I have thought more about this and I think it is best to do the form filling from saved data using JavaScript. The advantage over doing this with Python is that there would be a DOM to work with. We will have to get all the inputs and match their names to the names of the variables. It's not as simple as setting values because of inputs like checkboxes so some input types will need specific logic.
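The checkbox caveat is the crux: assigning to `.value` restores text inputs but not checked-state, so the restoration code has to branch on input type. A minimal sketch of the kind of type-aware logic being proposed here (the function name and the shape of the saved data are assumptions, not the branch's actual code):

```javascript
// Sketch: restore saved field values into a form, matching inputs to
// saved variables by name and handling the input types that need more
// than a plain `value` assignment. `saved` maps field names to the
// values captured when the task was last saved (assumed shape).
function restoreForm(form, saved) {
  for (const field of form.elements) {
    if (!field.name || !(field.name in saved)) continue;
    const value = saved[field.name];
    switch (field.type) {
      case "checkbox":
        // checkboxes record checked-ness, not a text value
        field.checked = Boolean(value);
        break;
      case "radio":
        // only the radio whose value matches should be checked
        field.checked = (field.value === value);
        break;
      case "select-multiple":
        for (const opt of field.options) {
          opt.selected = Array.isArray(value) && value.includes(opt.value);
        }
        break;
      default:
        field.value = value;
    }
  }
}
```

Text inputs, textareas, and single selects all fall through to the default branch; the special cases are exactly the ones where the DOM state lives somewhere other than `.value`.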
Poster: Cash Costello
vandurme ccostello charman
I have a branch that implements persistent tasks without disrupting existing functionality, but it's fairly limited/dangerous (only handles text and checkbox inputs at the moment, and is completely vulnerable to injection attacks) and made some very inelegant choices to display the functionality quickly (e.g. just adding another list, "persistent tasks", to the landing page). On the other hand, it's almost 100% additional code, rather than modifications, so I'll try to commit it carefully and piecemeal to the branch so parts can be taken/left as needed. It works for my purposes, but I'm interested in rounding it out (handling all input types, securing the form-restoration, etc), so maybe we should have a call later in the summer after folks take a look. I'll also add a working example.
Cash and Craig, I'm having a bit of an issue with server response-time: there's some really suboptimal code in what I wrote, but it's pretty snappy on my laptop, while on a pretty beefy AWS instance many clicks will hang for tens of seconds. This happens with both MySQL and Sqlite3 backends, using IP and FQDN (so not a DNS thing), and CPU/memory/disk use are all very low. And the AWS instance has ~100% of its bandwidth/disk "credits" sitting unused, which is the only AWS-specific recommendation I found to look at. Have you guys ever experienced this sort of issue on EC2? Just a Hail Mary...
Poster: Thomas Lippincott
lippincott I'm quite busy today, but will try to take a look at this. When you have a branch that I can look at, point me to it. If you have an EC2 instance that I can check out, I can do some profiling to figure out where the issue lies.
Poster: Cash Costello
ccostello thanks! do you have an ssh public key I could put on the server so you can log in?
Poster: Thomas Lippincott
sent an email with key
Poster: Cash Costello
ccostello Thanks: you should be able to ssh into ec2-user@3.87.135.105 now.
The "turkle" directory has a checkout of branch "issue-171-persistent-HITs", and a venv to source. It also has two non-repo files, "turkle/management/commands/load_chaucer.py" and "rebuild.sh". Invoking the latter will tear down the existing setup, rebuild and redeploy the images, and load the tasks, and is self-explanatory. I went the Docker route because I thought the slowdown might be sqlite-related, but I guess not.
You can log into xxxxx using user/pw xxxxx/xxxxx. The slowdowns seem nondeterministic, but if you click down into the first task and start moving around with e.g. shift-left/right, you should see that it sometimes just hangs for many seconds before proceeding.
I assumed there would be a performance issue with how I'm calculating "partial-completeness" of projects/batches (maybe I can do the non-invasive thing and create new "Task/Batch/ProjectCompletionState" classes), but I don't think that's what's to blame here, for a variety of reasons (in particular the nondeterminism and the fact this is a toy data set).
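For reference, the rollup itself is cheap even done naively, which supports the point that it isn't the bottleneck. An ORM-independent sketch of what "partial completeness" of a batch might compute (the data shape here is hypothetical; in Turkle this would presumably be a queryset aggregation rather than a Python loop):

```python
def completion_state(tasks):
    """Fraction of a batch's tasks that have a completed assignment (0.0-1.0).

    `tasks` is a hypothetical list of dicts with a boolean "completed" key.
    """
    if not tasks:
        return 0.0
    done = sum(1 for t in tasks if t.get("completed"))
    return done / len(tasks)
```

Even for thousands of tasks per batch this is sub-millisecond work, consistent with the observation that a toy data set with nondeterministic stalls points at something other than the completeness calculation.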
Poster: Thomas Lippincott
I can ssh to the server, but I cannot hit the website on 8080. Do I need to be on JHU vpn to access that port?
Poster: Cash Costello
ccostello hmmm, no, it's open (I'm not on VPN, and just checked the security group and it has the same spec as SSH)
Poster: Thomas Lippincott
It must have been something with APL's firewall, as I'm able to access the web site now. I'm getting a 500 error after logging in. I'm checking on that now.
Poster: Cash Costello
ccostello that's odd: I wasn't seeing crashes, just the slowdown.
Poster: Thomas Lippincott
Were you checking your browser's developer tools - specifically the Network tab?
Poster: Cash Costello
For the currently running instance, I see that it started at 2020-05-26 11:28:07 local time and that the first crash happened at 2020-05-26 11:30:10. If it was happening on an endpoint being hit by ajax, you wouldn't see it unless you checked the Network tab or maybe the console.
It is also possible that this is a completely separate issue.
Poster: Cash Costello
I'm going to restart the docker containers and test from a clean install.
Poster: Cash Costello
Would you like me to do anything code-wise, or just leave you to it for a few hours? It seems like maybe it would be easier to hang tight until you've poked around and then work on it afterwards.
Poster: Thomas Lippincott
mentioned in issue #262
Poster: Cash Costello
Give me an hour.
I also noticed a race condition with mysql starting up:
MySQLdb._exceptions.OperationalError: (2003, "Can't connect to MySQL server on 'db' (111)")
At some point I thought I did something so that the application would wait for the database to start up, but maybe that was on a different project...
Poster: Cash Costello
Ah, that's new, but makes sense. I had to throw a random sleep command into the bash script to wait for the management commands to finish: last I checked, the Docker folks didn't want responsibility for mechanisms to monitor internal container state. Seemed like a fair line to draw, not sure where it stands these days.
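A common alternative to the random-sleep hack is to poll the database port from the application side before starting up. A minimal sketch (the hostname `db` and MySQL's port 3306 are taken from the error message above; the function name is hypothetical):

```python
import socket
import time

def wait_for_db(host="db", port=3306, timeout=60):
    """Block until the database port accepts TCP connections, or raise.

    Polls once a second; intended to run before the app (or Django
    management commands) touch the database in a docker-compose setup.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=2):
                return
        except OSError:
            if time.monotonic() > deadline:
                raise RuntimeError(
                    f"database at {host}:{port} never came up"
                )
            time.sleep(1)
```

This only proves the port is open, not that MySQL has finished initializing, so some setups additionally retry the first real query; but it eliminates the startup race in the common case.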
Poster: Thomas Lippincott
I think the issue is with gunicorn. I've been able to reproduce on your EC2 instance with a clean turkle install. Something is stalling for 30 seconds and then either a connection is closed or a worker is killed and restarted. I have a meeting at 1 pm. I'll do some testing after that.
Poster: Cash Costello
ccostello thanks, I really appreciate it!
Poster: Thomas Lippincott
We normally run gunicorn behind a reverse proxy like nginx or apache. Looks like the issue is that web browsers will sometimes open additional connections to your server. Details here: https://hackernoon.com/chrome-preconnect-breaks-singly-threaded-servers-95944be16400
Because we use a reverse proxy, we don't see this issue. I also usually use more than one worker, which would help with this but does not solve the issue (if, for example, you had multiple tabs open).
I tested a little bit with a changed gunicorn config and it seems to be working with a single browser. You can do a git diff to see my changes. Any production use of this should have a reverse proxy in front of it.
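The actual diff isn't reproduced in the thread, but a plausible `gunicorn.conf.py` along these lines addresses the preconnect problem (the specific values are assumptions, not the change on the EC2 instance):

```python
# gunicorn.conf.py -- sketch, not the actual diff from the thread.
# With the default sync worker, a browser "preconnect" socket that never
# sends a request ties up the lone worker until `timeout` expires, which
# matches the ~30-second stalls described above. Threaded workers let
# idle connections wait without blocking real requests.
workers = 2              # assumed value
worker_class = "gthread" # threaded worker instead of the default "sync"
threads = 4              # each worker can hold several open connections
timeout = 30
```

As noted, this only mitigates the symptom for light use; any production deployment should still sit behind a reverse proxy like nginx, which absorbs idle client connections itself.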
Poster: Cash Costello
Thanks ccostello , this is so helpful, I really appreciate it: giving a demo tomorrow. I'll ping you and charman when the branch is handling arbitrary forms in a safe way.
Poster: Thomas Lippincott
ccostello charman vandurme (also cmay as they have mentioned this in other threads and may have other insights)
I think this is pretty close to ready, and I could have it cleaned up and merge-requested in a few hours' effort. Here are responses to Cash's bullet points in the issue:
Comments? Questions? Criticism? I would probably also make some minor template/view changes to consolidate the parallel code for the resumable case, and add a javascript file with the utilities for field-restoration.
Poster: Thomas Lippincott
ccostello charman cmay vandurme
If I don't hear otherwise by end of day tomorrow (Friday), I'll start on these changes with the understanding that they're more-or-less acceptable and the effort won't be wasted (they'll still go through a merge request; I just don't want to find out at that point that there are strong objections!)
Poster: Thomas Lippincott
You have my attention! Sorry for the delay.
To address your question on partial work, I don't think it should be exported: as a framework builder, not exporting partial results is the more predictable option.
I promise to give you more detailed feedback by noon tomorrow.
Poster: Cash Costello
ccostello Thanks! And no worries, I appreciate all of this. The partial task thing makes sense. I'm guessing the best place to hook in for things like regular backups would be to add admin commands? I've been meaning to look into whether that can be done non-invasively, e.g. letting a project-specific repo define its own task-loader logic (that's what I've been doing for the Chaucer study, though I added the command directly to my turkle branch). I'm a bit paranoid about backups here, since the humanities folks put a ton of subtle effort into some of their "tasks", and their good will is pretty much my only academic currency right now!
Poster: Thomas Lippincott
Good point. There is a difference between the requester downloading all the data to process and the admin backing up everything. Right now we expect the requesters will use the web UI to download their data. There are also some scripts for this that we eventually want to move to a full blown API. We expect the admin to set up regular backup cron jobs that dump the entire database.
Our only documentation on database backups is here: https://github.com/hltcoe/turkle/blob/master/docs/ADMINISTRATION.rst#database-backups If you think the documentation needs to say more about backups and restoring from them, open an issue so that we can capture that. Right now we're assuming the admin is comfortable with this, but maybe we're assuming too much.
Poster: Cash Costello
lippincott thank you for this, I think it will be very useful. My only feedback is about exports. For me, I would rather not get partial results in the export, but if I did, an explicit CSV field indicating whether each record was partial would alleviate that concern to a large degree. But I don't know if my approach/preferences are representative.
Poster: Chandler May
On task assignment expiration, I would hate to lose that just because I chose to turn on the resumable option.
Why did we add expiration? Because annotators will accept a task and never complete it. This most likely happens when annotators have the auto-assignment option set and then leave the project or take time off. I'm assuming most projects keep the default of 24 hours.
I cannot imagine that the problem of abandoned tasks goes away with resumable projects.
What are our options:
I would start with option 2 and perhaps in the future add something like 3 or 5.
Poster: Cash Costello
lippincott charman I'm now thinking about the UI for this. First, how does the annotator submit a partial annotation?
Do we want separate buttons for submit (as finished) and save (as partial)?
I think we have to, because any code we write to automatically detect the annotator's intention will fail sometimes.
If we require two buttons, should we do away with automatically detecting whether the template has a submit button? Instead, always require that the template has its own submit button. Our templates would still work with MTurk but MTurk templates that do not have their own submit button would not work on Turkle.
The reason I'm suggesting this is that it will start to get messy to have code that detects the presence of a submit button and a save button and properly create and position one or the other.
Poster: Cash Costello
ccostello charman vandurme cmay So, in the somewhat ad-hoc templates for persistent tasks I threw together, rather than any submit button, I just javascripted in shift+arrow key navigation forward/backward/up through the task assignments, saving state whenever leaving a page. I also made the browsing a bit hierarchical, so that persistent batches on the front page are colored by how complete their tasks are, and clicking on them led to a list of their tasks colored according to how complete they are. Ideally I would have had this nested deeper, because this was a book, with chapters, with 10-line chunks as the tasks, so with the current setup the front page had a batch for each chapter: messy, and would get a lot more so if there were e.g. multiple books.
That's all outside of what Amazon provides, of course, so I'm not saying it needs to be addressed, but I do think that's a very typical scenario for persistent annotation: a tree, where the root/batch is something pretty damn big, like a book, the leaves are tasks, and there's intermediary structure that would be helpful to have directly navigable. It probably could be done non-invasively w.r.t. existing functionality by adding a model class just for the internal node structure of persistent batches, and views that are only used for persistent batches with said structure.
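The tree described here (book at the root, chapters as internal nodes, 10-line chunks as leaf tasks), with completeness rolling up from the leaves, could be modeled generically. A plain-Python sketch, independent of Django models (class and field names are hypothetical):

```python
class BatchNode:
    """Hypothetical internal-node structure for a persistent batch.

    Leaves are tasks (carrying a completed flag); internal nodes, e.g.
    chapters or books, derive completeness from their children, which is
    what the front-page coloring would key off.
    """

    def __init__(self, name, children=None, completed=False):
        self.name = name
        self.children = children or []
        self.completed = completed  # only meaningful for leaves

    def completion(self):
        """Fraction complete: leaf flag, or mean of children's fractions."""
        if not self.children:
            return 1.0 if self.completed else 0.0
        return sum(c.completion() for c in self.children) / len(self.children)
```

In an actual implementation the rollup would likely be an aggregate query rather than a recursive walk, but the shape of the navigation hierarchy is the same.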
Poster: Thomas Lippincott
lippincott I had played around a little with your UI when I was figuring out the performance issue.
I had considered a resumable task to be something that you can save with a partial result and then come back to later to finish. You can come back to that task as many times as you want in order to update it and save it as partial, but once it is submitted, you cannot go back to edit it. Does your concept have a final submit on that task?
A second way your UI seemed different is that really the book or the chapter was a single task and each chunk was a sub-task. It makes sense for an annotator to move back and forth among the sub-tasks and for a single annotator to be assigned to the task.
Do you agree with the distinctions that I am making here?
It would also help me if you explained a little more what you mean by persistent tasks (or did I capture it with the idea of one large task with many sub-tasks that the annotator comes back to over a period of days/weeks)?
Poster: Cash Costello
mentioned in issue #273
Poster: Craig Harman
Support users starting a task, saving intermediate results, and then coming back to it later.
Also, this would be the first feature that makes our templates incompatible with MTurk. I believe the goal would be to continue to support MTurk templates but offer a superset of features. So we won't require anything in a template that would make us incompatible, but we will add optional features to better support our use cases.
Poster: Cash Costello id: 171