Closed caharding closed 6 years ago
This is a placeholder issue to hold 10% time for @willdoran or another dev to continue with some CREES research during the April and early May cycle, as outlined in the roadmap. The work will involve helping to gather or structure data from Uchaguzi so that the Open University can create a new data model specific to elections. It should be fairly light, with relatively little coding, but we wanted to be sure to preserve the time!
@Shadrock to add context on this scope and to open a bug for additional technical info from the partner.
Just noting that this issue will be on a short pause until Will returns from vacation.
For @willdoran (when he gets back!).
This feedback on potential CREES integration issues came to me via e-mail from @evhart. The issues were compiled as part of an evaluation that was done, I believe, as a simulation with students in the Netherlands as part of COMRADES. Several Ushahidi staff supported this simulation: Robbie, Will, and David were involved specifically.
- There were many examples where the related/non-related categories were the only categories present (information types missing).
- The service name was displayed rather than the category (e.g., 'Info Type' instead of one of the possible information type categories).
- Different categories were returned for the same post. This was mentioned by students in their report. This should not be possible. Could it be due to using CREES and CREES Uchaguzi together?
Another issue, found when exporting the Uchaguzi data using SQL for Kenny (from COMRADES), was linked to the encoding of the LAT/LON data. We had some discussion with Will, Robbie and David Losada about it, but we could not get the correct LAT/LON data. See the SQL example below:
SELECT ASTEXT(value) as location, x(value) as x, y(value) as y FROM post_point LIMIT 10
location x y
POINT(4.764590900135562e19 -1.3068462330237261e-78) 4.764590900135562e19 -1.3068462330237261e-78
POINT(4.764590900135562e19 -1.3068462330237261e-78) 4.764590900135562e19 -1.3068462330237261e-78
POINT(-1.189199587984945e18 7.612047485286466e218) -1.189199587984945e18 7.612047485286466e218
POINT(-1.189199587984945e18 7.612047485286466e218) -1.189199587984945e18 7.612047485286466e218
POINT(-1.189199587984945e18 7.612047485286466e218) -1.189199587984945e18 7.612047485286466e218
POINT(-2.1201109403518694e167 -1.2673912903807526e166) -2.1201109403518694e167 -1.2673912903807526e166
POINT(4.764590900135562e19 -1.3068462330237261e-78) 4.764590900135562e19 -1.3068462330237261e-78
POINT(-11665962360.439825 -2.5803972529263045e-165) -11665962360.439825 -2.5803972529263045e-165
POINT(-3.27237135808076e-194 -1.18416047305313e-190) -3.27237135808076e-194 -1.18416047305313e-190
POINT(-1.189199587984945e18 7.612047485286466e218) -1.189199587984945e18 7.612047485286466e218
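To make it easier to see how many exported rows are affected, here is a minimal sketch of a sanity check over the `ASTEXT(value)` output. It does not diagnose the cause of the bad values; it only flags coordinates that cannot be real ones. The assumptions are mine, not from the thread: that `x` is meant to be longitude and `y` latitude, and that a non-zero coordinate smaller than `1e-9` degrees is suspicious. The function names and the well-formed sample point are hypothetical.

```python
import re

# Matches WKT such as "POINT(36.8219 -1.2921)"; exponent notation is allowed.
_POINT_RE = re.compile(r"^POINT\(\s*([^\s()]+)\s+([^\s()]+)\s*\)$")


def parse_point(wkt):
    """Return (x, y) as floats from a WKT POINT string."""
    match = _POINT_RE.match(wkt.strip())
    if match is None:
        raise ValueError(f"not a WKT POINT: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


def is_plausible_lon_lat(x, y, tiny=1e-9):
    """Assumes x is longitude and y is latitude, both in degrees.

    `tiny` is an assumed threshold: a non-zero value this close to zero
    is treated as a sign of a misread double rather than a real coordinate.
    """
    in_range = -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0
    denormal_like = any(v != 0.0 and abs(v) < tiny for v in (x, y))
    return in_range and not denormal_like


rows = [
    "POINT(4.764590900135562e19 -1.3068462330237261e-78)",   # from the export
    "POINT(-3.27237135808076e-194 -1.18416047305313e-190)",  # from the export
    "POINT(36.8219 -1.2921)",  # hypothetical well-formed point, for contrast
]
for wkt in rows:
    x, y = parse_point(wkt)
    status = "plausible" if is_plausible_lon_lat(x, y) else "suspect"
    print(f"{status}: {wkt}")
```

The range check alone is not enough: the second row above sits inside the valid range because both values are effectively zero, which is why the sketch also flags tiny non-zero magnitudes.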
@Shadrock Would it be possible to get links to Post examples for the above 3 issues? I can use the text of those posts to see what's happening specifically.
@evhart can you comment on the above, please? Do you have links to the post examples?
@willdoran @jrtricafort not sure if compiling Uchaguzi data should be a separate issue or go here. It will be used to train a new CREES model... so it's related to this ticket. Please advise if I need to create a new issue or provide further details.
@willdoran @Shadrock From the COMRADES test instance, I have the following examples:
@willdoran can you comment on whether we've addressed the bugs in this thread? Once that's done, we can close this issue. I do not foresee any further work on CREES for the COMRADES project.
@Shadrock We haven't addressed them yet, but I'll move them up and look at them next week during the retreat. I believe I have fixed them but I need to confirm this on the Comrades deployments. I'll update the issue when I test.
@Shadrock I believe these issues are fixed, as I couldn't reproduce them on the newest setup of Comrades (the one that contains HDX).
Excellent, thanks @willdoran. I'm good to close this out and then re-open if they come up again.
CREES provides automatic tagging: posts are tagged automatically by an ML tool. Our grant requires that our partner, OpenUniversity, train data models for two use cases: Crisis & Elections.
We have an existing data model for crisis scenarios working. For our grant we need to hand over a new set of data for OpenUniversity to train an additional model.
- OpenUniversity has a data set on Hurricane Harvey
- Ushahidi has a data set on Uchaguzi
We need to make sure the elections data is structured such that it can be used in the OU ML tool.