juliema / label_reconciliations

Code for reconciling multiple transcriptions for a label
MIT License

reconciliation reading a select menu as text #46

Closed juliema closed 6 years ago

juliema commented 6 years ago

In the latest phenology project -- Evening Primrose -- the only question in there is a select menu and it is being read as text.

rafelafrance commented 6 years ago

The data in the annotations field is being delivered in the same way as text fields are delivered. I am not changing how we parse the annotations because that would completely break every other reconciliation.

Question: Was this actually a drop-down list or some new type of field like a radio button?

Having this field interpreted as text does not materially change the output. However, what I can do is add a new field type "radio" to handle the difference if needed.

FYI: You need to remove the first data line (line 2) from the CSV file you sent. It doesn't belong to this expedition.

denslowm commented 6 years ago

To answer the question: the task is called a "Question." To the user it does appear as a radio button.

rafelafrance commented 6 years ago

As a compromise, I am adding an "Exact match" option to the text field type.

PmasonFF commented 6 years ago

Likely the best way to handle this is to flatten the question JSON with an auxiliary script so that it is a straight CSV file, and then use reconcile.py with the -f csv argument. Otherwise you will end up playing whack-a-mole trying to fit reconcile -f nfn to every special case. Flattening this annotations field is a fairly trivial exercise, and scripts to do it are readily available in the zooniverse Data-digging Git. Using reconcile.py to aggregate and produce a consensus for a simple question task is also a bit like using a sledgehammer to place tacks.
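For readers following along, the flattening step described above can be sketched roughly as follows. This is a minimal illustration, not one of the Data-digging scripts: it assumes the classification export has `classification_id`, `subject_ids`, and `annotations` columns, that `annotations` holds a JSON list of objects with `task` and `value` keys, and that the question is task `T0`. The function names and output column names are placeholders.

```python
import csv
import json


def flatten_question(rows, task_id="T0"):
    """Yield one flat dict per classification holding the answer to task_id.

    Assumes each row has an 'annotations' cell containing a JSON list of
    {"task": ..., "value": ...} objects (an assumption about the export).
    """
    for row in rows:
        answer = ""
        for annotation in json.loads(row["annotations"]):
            if annotation.get("task") == task_id:
                value = annotation.get("value")
                # A single-answer question is assumed to store a plain string.
                answer = "" if value is None else str(value)
                break
        yield {
            "subject_id": row["subject_ids"],
            "classification_id": row["classification_id"],
            "answer": answer,
        }


def flatten_csv(in_file, out_file, task_id="T0"):
    """Read an export from in_file and write a flat CSV to out_file."""
    reader = csv.DictReader(in_file)
    writer = csv.DictWriter(
        out_file, fieldnames=["subject_id", "classification_id", "answer"]
    )
    writer.writeheader()
    for flat in flatten_question(reader, task_id):
        writer.writerow(flat)
```

Both arguments to `flatten_csv` are open file objects, so the same function works on real files (opened with `newline=""`) or on in-memory buffers. The output column names would need to match whatever reconcile.py expects in its csv mode.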

rafelafrance commented 6 years ago

The nfn.py script does flatten the various JSON structures. (Albeit, my Python & pandas skills were much weaker when I wrote the script.) The problem is the structure itself; it's ambiguous in situations like this one. What I should do, but have avoided so far, is to see if I can use the workflow CSV to guide the parsing. But that complicates things for the end users because they'll need another file to do the parsing. Then again, what's my audience? 4-5 people? Anyway...

I added the exact matches to the text field. If this doesn't break things I'll tag a new release. @denslowm, @juliema please let me know if things are OK.
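As an illustration of what an exact-match rule for a select or radio answer could look like, here is a minimal sketch. It is not the code added to reconcile.py; the function name, the returned explanation strings, and the tie handling are all assumptions made for the example.

```python
from collections import Counter


def reconcile_exact(values):
    """Pick the most common answer by exact string match (hypothetical helper).

    Returns (winner, explanation). Blank answers are ignored, and a tie for
    first place is reported as no consensus rather than guessed at.
    """
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return "", "all blank"

    (top, top_n), *rest = Counter(cleaned).most_common()
    if rest and rest[0][1] == top_n:
        return "", f"no consensus: tie among {len(cleaned)} answers"
    return top, f"exact match: {top_n} of {len(cleaned)}"
```

Unlike fuzzy text matching, this treats "Yes" and "yes" as different answers, which is usually what one wants for fixed menu choices.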

PmasonFF commented 6 years ago

> Then again, what's my audience? 4-5 people?

Precisely! And the tools to handle these things and generate a file that the current reconcile.py can handle are readily available elsewhere. Meanwhile reconcile.py grows in complexity, which is not good.

juliema commented 6 years ago

I did a quick run on the primrose lab and things look good!

rafelafrance commented 6 years ago

@PmasonFF Making code go away is a great thing. What tools do you have (maybe re/send links)? If it's a matter of training my handful of users to use them, then maybe reconciliation can put some "value added" on top of that to pick up the specific requirements.

PmasonFF commented 6 years ago

Specific to flattening question tasks: https://github.com/zooniverse/Data-digging/tree/master/example_scripts/Building_blocks/Questiontasks

Working with transcription tasks and Nfn reconcile.py: https://github.com/zooniverse/Data-digging/tree/master/example_scripts/Building_blocks/Transcriptiontasks

These are just my sections of the zooniverse Git; there are many other project-specific examples too. For my stuff, one needs to pick the appropriate blocks and add them to a basic framework to flatten the file so it can be fed directly to reconcile.py (see "working with reconcile.py" in the READMEs). With the appropriate blocks one can reduce any transcription, drop-down, single-answer question, or text subtask for drawing tools to a single text field in a CSV file suitable to run with reconcile.py. Careful pre-formatting can even make columns of numbers such as temperatures or pressures suitable for reconciliation. (For numbers, decimal points count, so they can be replaced with characters that reconcile.py can give appropriate weight to, such as replacing 31.59 with 31dp59 and converting back after the reconciliation.)
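The decimal-point substitution mentioned above could be sketched like this. The `dp` marker comes from the example in the comment; the function names and the rule that only a dot or marker sitting between two digits is converted are assumptions for illustration.

```python
import re

# A dot with a digit on both sides is treated as a decimal point.
_DECIMAL_POINT = re.compile(r"(?<=\d)\.(?=\d)")
# The "dp" marker with a digit on both sides is treated as an encoded point.
_DECIMAL_MARKER = re.compile(r"(?<=\d)dp(?=\d)")


def encode_decimals(text):
    """Replace decimal points with 'dp' before reconciliation."""
    return _DECIMAL_POINT.sub("dp", text)


def decode_decimals(text):
    """Restore decimal points after reconciliation."""
    return _DECIMAL_MARKER.sub(".", text)


print(encode_decimals("31.59"))   # 31dp59
print(decode_decimals("31dp59"))  # 31.59
```

Restricting the match to digit-dot-digit leaves sentence-ending periods and abbreviations untouched, so the round trip only affects numbers.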