juliema / label_reconciliations

Code for reconciling multiple transcriptions for a label
MIT License

Upgrade to v0.6.0 Causing issues with csv headers #81

Open rbruhn opened 3 months ago

rbruhn commented 3 months ago

After upgrading the Biospex server to Ubuntu 22.04 with Python 3.10.12, I was hoping to upgrade reconciliations to v0.6.0. In testing, everything runs and the files are generated. However, the column headers in the reconcile, summary, and transcript files have changed: it looks like the task number is now being included. Here is an example header from my reconcile.csv file:

subject_id | T3_1 Location | T4_1 Habitat & Description | T6_1 Collected By | T7_1 Collector Number | T8_1 Month | T9_1 Day | T10_1 Year | T12_1 County | eol | mol | county | country | idigbio

All the columns after the tasks come out fine. The label reconciliation is being performed on classification.csv downloaded from Zooniverse. For now, I will try to install a previous version of Python in a virtual environment and run v0.5.1. If there is a way to fix this issue on v0.6.0 please let me know.
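As a possible stopgap (a post-processing sketch, not a fix inside the reconciliation code itself), the task prefix could be stripped from the header after the files are generated. This assumes every unwanted prefix has the shape shown in the example header above: `T`, a number, an underscore, another number, then a space. The `strip_task_prefix` helper is a hypothetical name, not part of label_reconciliations.

```python
import re

# Assumption: unwanted prefixes look like "T3_1 " or "T12_1 ", as in the
# example header in this issue. Other prefix shapes are left untouched.
TASK_PREFIX = re.compile(r"^T\d+_\d+\s+")


def strip_task_prefix(header):
    """Return a copy of the header with any leading task prefix removed."""
    return [TASK_PREFIX.sub("", name) for name in header]


header = [
    "subject_id",
    "T3_1 Location",
    "T4_1 Habitat & Description",
    "T12_1 County",
    "eol",
]
cleaned = strip_task_prefix(header)
print(cleaned)
```

Columns without a matching prefix, such as `subject_id` and `eol`, pass through unchanged.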

Edit: I can't get v0.4.3 working on Ubuntu 22.04, even though I installed Python 3.8 and created a virtual environment. It keeps failing on the requirements install.

I did try v0.5.1 with Python 3.10 and it worked. However, the names of the columns we get back from Zooniverse are changed. For example, we have Zooniverse return certain information for us in the classifications. In the reconcile and transcript CSV files, these columns have appeared as:

subject_eol | subject_mol | subject_county | subject_country | subject_idigbio | subject_imageURL | subject_recordId | subject_imageName

All of these have the "subject_" prefix removed in v0.5.1 and v0.6.0. Since I use these names in my code and they are stored in MongoDB, I can't simply change them for past records. This really leaves me in a spot, since I can't get v0.4.3 working.
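One workaround that might keep the stored names stable is to put the "subject_" prefix back on the header row after the CSV files are written. The sketch below uses only Python's standard `csv` module and assumes the affected columns are exactly the eight listed above. `SUBJECT_FIELDS`, `restore_subject_prefix`, and `rewrite_header` are hypothetical names for illustration, not part of label_reconciliations.

```python
import csv
import io

# Assumption: these are the only columns that lost their "subject_" prefix,
# taken from the list in this issue.
SUBJECT_FIELDS = {
    "eol", "mol", "county", "country", "idigbio",
    "imageURL", "recordId", "imageName",
}


def restore_subject_prefix(header):
    """Add "subject_" back to the known subject columns only."""
    return [f"subject_{name}" if name in SUBJECT_FIELDS else name
            for name in header]


def rewrite_header(src, dst):
    """Copy CSV rows from src to dst, renaming only the header row."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:  # empty input: nothing to copy
        return
    writer.writerow(restore_subject_prefix(header))
    writer.writerows(reader)


# In-memory stand-in for a generated reconcile.csv.
sample = io.StringIO("subject_id,Location,eol,county\n1,Florida,123,Leon\n")
out = io.StringIO()
rewrite_header(sample, out)
print(out.getvalue())
```

The match is exact and case-sensitive, so a transcribed field such as "County" would not be confused with the subject column "county".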

rbruhn commented 3 months ago

Adding to the above, I see the --explanations argument and code were removed. We are using this to create our own "Expert Review" section of the site where users can go through the discrepancies and fix them. It would be great to have this functionality back.

PmasonFF commented 3 months ago

> Adding to the above, I see the --explanations argument and code were removed. We are using this to create our own "Expert Review" section of the site where users can go through the discrepancies and fix them. It would be great to have this functionality back.

rbruhn, are you aware of this code: https://github.com/PmasonFF/Reconcile-Editor? A description and some explanation are here: https://www.zooniverse.org/talk/1322/2729009. And yes, it uses the explanation column, so it needs the older version of reconcile.py. Feel free to contact me if I can be of assistance. Peter

rbruhn commented 3 months ago

@PmasonFF Hi. Yes, I saw it the other day. Nice work! I built a GUI in Biospex for our users. It simply pulls the records that have transcription problems and displays the image and available answers. It gives the user the option to select or confirm one of the answers, or to enter their own. We call it the Expert Review.

Unfortunately, we still use v0.4.3 because it ran on Python 3.8 and we need the explanations argument. Since we upgraded to Ubuntu 22.04 and Python 3.10, I was hoping to get the same functionality from the newer code. For now, I installed a copy of the working version we had on the old server, installed Python 3.8 by hand, and it appears to be working. A fresh install of v0.4.3, even with Python 3.8, doesn't work anymore.

I'm not a Python programmer, so I'm hoping these issues get resolved. For now, I will attempt to get v0.4.3 working with Amazon Lambda and keep it static there.