zooniverse-glacier / notesFromNature

https://www.notesfromnature.org/
Apache License 2.0

Progress accuracy #318

Closed joanball closed 10 years ago

joanball commented 10 years ago

Users have commented that the total for macrofungi (currently 43) seems inaccurate. Are there still problems with the accuracy of progress stats? If so, is it only macrofungi, or might there still be inaccurate totals for the other collections as well?

robgur commented 10 years ago

I don't think these numbers are necessarily wrong, but people might not understand that we need four transcriptions per record before the process is done. What we don't have, and I'd like to add it as a priority fix, is the number of transcriptions completed for a collection (as opposed to the number of subjects finished). This would help people see that progress is happening even if the subjects are not yet finished.

-r
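The distinction between transcriptions completed and subjects finished can be sketched as follows. This is only an illustration: the function name, the subject ids, and the per-subject counts are hypothetical, and the only figure taken from the thread is the requirement of four transcriptions per record.

```python
from typing import Dict

REQUIRED_TRANSCRIPTIONS = 4  # from the thread: four transcriptions finish a record


def progress_stats(counts: Dict[str, int],
                   required: int = REQUIRED_TRANSCRIPTIONS) -> Dict[str, int]:
    """Return both progress measures for one collection.

    `counts` maps a (hypothetical) subject id to the number of
    transcriptions that subject has received so far.
    """
    return {
        # Every transcription counts here, finished subject or not.
        "transcriptions_done": sum(counts.values()),
        "transcriptions_needed": required * len(counts),
        # A subject only counts here once it reaches `required`.
        "subjects_finished": sum(1 for n in counts.values() if n >= required),
        "subjects_total": len(counts),
    }


# Made-up example: five subjects, only one of them finished.
example = {"s1": 4, "s2": 3, "s3": 3, "s4": 1, "s5": 0}
stats = progress_stats(example)
print(stats)
```

In this made-up example more than half of the needed transcriptions are done while only one of five subjects is finished, which is the gap a second statistic would make visible.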

steveraden commented 10 years ago

I want to put in some time later this week reviewing the past issue as well as Rob's proposal. I agree that displaying both numbers probably has value. Also, in previous weeks I tried to get user darryluk, who had commented on problems with his personal totals, to send us some debug information, but didn't hear back.

robgur commented 10 years ago

Cool, Steve. Thanks. I will see if I can connect back with darryluk again. We can only try so hard... -r

joanball commented 10 years ago

So, you are confident that the current numbers reflect the four-transcription requirement and are not inaccurate? I would like to respond to the comments in a timely manner.

FYI, Darryluk may have abandoned the project. He used to be very active every day answering other users' questions but has been completely absent for the past few weeks.

joanball commented 10 years ago

Well, right after I said that, he responded to a ton of questions in the forum. He also started a new thread, saying "I still can't get my totals to increase and I have not heard that the issue can be resolved, so I haven't done any work on here for several weeks now." So, I directed him to the old thread. Seems like another communication problem.

steveraden commented 10 years ago

I just messaged him directly, referring him to http://talk.notesfromnature.org/#/boards/BNN0000002/discussions/DNN00001o0 where I had left details (12/23) on how he could help us diagnose his problem.

poboyski commented 10 years ago

Our transcribers are getting frustrated that their personal totals are not increasing (just like darryluk). One of them wrote me personally to say, "There are several of us working on the CalBug project who have been posting a complaint that hasn't been addressed. The count that you accumulate by doing bugs isn't working. I have done over 400 (if not more), spending entire 9 hour days doing nothing but (I love bugs. Fascination and obsession doesn't even describe it.) I have a credit of 43, which went out today. Since I began 2 weeks ago it stopped at 36."

joanball commented 10 years ago

Any news on the status of overall progress accuracy?

Here is one recent comment about the overall project accuracy in relation to an individual's completed transcriptions.

by tmeconverse: "I'm not sure it's working quite yet. In Herbarium my numbers are now advancing properly, but there has been no advancement in the number of specimens completed. Before the meltdown, when I did 20 labels, the total completed in the Herbarium count might go up 3 or 4. The last three digits of the collection items completed has remained at 866 for the last three days, and I've done over a hundred in that time period. I also still don't understand the percentage finished. Before the fix, we were 48 % done with 28,000-ish total labels in the data base. Now we are 55 % done with around 55,000-ish labels. Does that mean that the earlier total of 28,000-ish was really at 100% before, or that the remaining 10,000-ish labels from before the melt-down are no longer in the mix to be done?"

joanball commented 10 years ago

Just wanted to update this with recent complaints in the past day about the total transcriptions.

Herbarium has been on 55% completion 28,866 / 52,552 records for a while, possibly since the new records were added when it ran out and appeared to be down... http://talk.notesfromnature.org/#/boards/BNN0000002/discussions/DNN00001s3?page=1&comment_id=52f4eea5f7e39829cf000da4

bump Is there any way the totals of the projects could get updated? I don't think it's very helpful to have them sitting at the exact same numbers since quite a long time now. To us, it looks like we're not making any progress at all. :-( If the tool doesn't work, why show it? http://talk.notesfromnature.org/#/boards/BNN0000002/discussions/DNN00001sf?page=1&comment_id=52f23093f7e3986e1700050a

denslowm commented 10 years ago

Steve has been working on this and I just got off the phone with him. I will include you in the message I am about to send.

robgur commented 10 years ago

I think we'll get an update from Steve soon. It is possible this isn't a mistake, Joanie. Here is what I think is happening. When new records come into Notes from Nature, the tool randomly picks new records to transcribe. It takes a long time before, on the random draws, a record gets four transcriptions. If we have, say, 20,000 images, we could have thousands of transcriptions done and never see a subjectid have 4 transcriptions. As we exhaust the pool of 80,000 transcriptions, the completion rate for each image/subjectid goes faster and faster. I imagine the surprise with having stuff rapidly finish in the herbarium was due to thousands of images/subjectids with 3 transcriptions. We could of course model the probability functions for these. The main message here is that we need both transcriptions completed and subjectids/images completed. With that, we are in much better shape to explain progress. -r
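The lag described above can be illustrated with a small simulation. This is a sketch under a stated assumption, not the actual Notes from Nature selection logic: each transcription is assumed to go to a subject drawn uniformly at random from those still below four transcriptions. The function name and the scaled-down pool of 2,000 subjects are hypothetical.

```python
import random
from typing import List, Tuple


def simulate(num_subjects: int, required: int = 4,
             checkpoints: int = 8, seed: int = 0) -> List[Tuple[int, int]]:
    """Assign transcriptions uniformly at random among unfinished subjects.

    ASSUMPTION: uniform random choice; the real tool may weight differently.
    Returns (transcriptions_done, subjects_finished) at evenly spaced points.
    """
    rng = random.Random(seed)
    counts = [0] * num_subjects
    active = list(range(num_subjects))  # subjects still below `required`
    total_needed = num_subjects * required
    step = max(1, total_needed // checkpoints)
    finished = 0
    history = []
    for done in range(1, total_needed + 1):
        i = rng.randrange(len(active))
        subject = active[i]
        counts[subject] += 1
        if counts[subject] == required:
            # Retire the subject: swap it with the last entry and pop.
            active[i] = active[-1]
            active.pop()
            finished += 1
        if done % step == 0:
            history.append((done, finished))
    return history


NUM_SUBJECTS = 2000
history = simulate(NUM_SUBJECTS)
for done, fin in history:
    print(f"{100 * done / (NUM_SUBJECTS * 4):5.1f}% of transcriptions done, "
          f"{100 * fin / NUM_SUBJECTS:5.1f}% of subjects finished")
```

Under this assumption the share of finished subjects trails the share of completed transcriptions for most of the run and only catches up near the end, consistent with the explanation above.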

steveraden commented 10 years ago

Rob, you're correct that with the random weighting it is going to take some time to show completions.

I'll follow up on the improvements in the next email, but I want to flash you some numbers.

These are the numbers in the backend for Herbarium. I will send a follow-up with the queries. This snippet shows that 55% is the right number as of today: 28,866 complete / 52,552 total.

I think the discrepancy that classification_count/4 <> complete is from the change in retirement, but would have to query that out of mongo later.

"classification_count": 127175,
"stats": {
  "active": 23686,
  "complete": 28866,
  "inactive": 0,
  "paused": 0,
  "total": 52552
}
//src: https://api.zooniverse.org/projects/notes_from_nature/groups
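As a quick check of the arithmetic, the figures from the snippet above can be plugged in directly. The variable names are illustrative, and the last step assumes that every complete subject received exactly four classifications, which the thread notes may not hold because of the change in retirement.

```python
# Figures copied from the stats snippet quoted in this comment.
classification_count = 127175
stats = {"active": 23686, "complete": 28866, "inactive": 0, "paused": 0, "total": 52552}

# Share of subjects finished: this is the number shown to users.
percent_complete = 100 * stats["complete"] / stats["total"]

# Naive estimate of finished subjects if every classification were
# spread evenly in groups of four.
naive_complete = classification_count / 4

# ASSUMPTION: each complete subject used exactly four classifications.
# Whatever is left over then sits on subjects that are still active.
leftover = classification_count - 4 * stats["complete"]
per_active = leftover / stats["active"]

print(f"percent complete: {percent_complete:.1f}%")
print(f"classification_count / 4: {naive_complete}")
print(f"classifications on active subjects: {leftover} ({per_active:.2f} each)")
```

Under that assumption, the gap between classification_count / 4 and the complete count is simply the work already done on subjects that have not yet reached four.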

steveraden commented 10 years ago

Here is the progress that Michael was referring to, only deployed to demo. As we talked about it, and also looking at Rob's comments, it now seems to be not enough info.

Michael said he would talk to you about it at more length, but I wanted to get a basic progress update out. I put the things we actually need to take a decision on in italics.

http://zooniverse-demo.s3-website-us-east-1.amazonaws.com/notes_from_nature/

Displaying the "total subjects" stat is probably the wrong verbiage and the wrong stat. No stat might be the right answer here.

steveraden commented 10 years ago

Fixes went up today for this.