WikiEducationFoundation / WikiEduDashboard

Wiki Education Foundation's Wikipedia course dashboard system
https://dashboard.wikiedu.org
MIT License

Help migrating away from ORES #5465

Closed isaranto closed 1 year ago

isaranto commented 1 year ago

Hi! I am part of the Wikimedia ML team. We are starting to migrate ORES clients to another infrastructure, since we are planning to deprecate ORES. More info in https://wikitech.wikimedia.org/wiki/ORES

TL;DR:

The ORES infrastructure is going to be replaced by Lift Wing, a more modern, Kubernetes-based service.

All the ORES models (damaging, goodfaith, etc.) are running on Lift Wing; more on how to use them in https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage

We have new models called Revert Risk that can replace goodfaith and damaging, for example. They are available on Lift Wing, and we'd like to offer them as a valid and more precise/performant alternative to the ORES models.

If you'd like to try them, we'd help with the migration process!

Thanks in advance,

ML team

References to ORES in this codebase

elukey commented 1 year ago

I had a chat with @ragesoss over email about this and a few things came up:

They use the batching API (single call, multiple scores requested) and probably migrating to parallel requests to Lift Wing may make the overall product experience worse.

With https://ores-legacy.wikimedia.org/docs we have already limited the amount of scores in a single query to 20, so even if we don't migrate them directly to Lift Wing some adjustment will be needed in the code.
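The 20-score limit means a client that used to send one large batch has to split its revision IDs across several queries. A minimal sketch in Python, assuming the limit counts revision IDs when a single model is requested; `chunked` and `legacy_urls` are hypothetical helper names, not part of any Wikimedia library:

```python
from urllib.parse import urlencode

ORES_LEGACY_BASE = "https://ores-legacy.wikimedia.org/v3/scores"
# Limit quoted in this thread; assumed here to count rev IDs for one model.
MAX_SCORES_PER_QUERY = 20


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def legacy_urls(wiki, model, rev_ids, limit=MAX_SCORES_PER_QUERY):
    """Build one ores-legacy URL per batch of revision IDs."""
    urls = []
    for batch in chunked(list(rev_ids), limit):
        query = urlencode(
            {"models": model, "revids": "|".join(str(r) for r in batch)},
            safe="|",  # keep the pipe separator readable in the query string
        )
        urls.append(f"{ORES_LEGACY_BASE}/{wiki}?{query}")
    return urls
```

Each URL can then be fetched in turn and the per-batch results combined on the client.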

@ragesoss maybe we could attempt to use ores-legacy, see how it goes, and then migrate completely to Lift Wing later on? Ores Legacy is basically a wrapper around Lift Wing that offers a very similar API to ORES, so it can be used as a drop-in replacement. A quick test could be to:

ragesoss commented 1 year ago

Are the docs correct that ores-legacy will only be available until December 2023? (or some unspecified time after that, but not indefinitely?) If so, we should probably just figure out how to migrate to Lift Wing sooner rather than later.

elukey commented 1 year ago

@ragesoss yes, correct: ores-legacy is only a migration tool, and we'll eventually turn it off (hopefully at the beginning of 2024). My proposal to use it was to have a quick way to avoid changing too much of your code and see how latency impacts the tool (since behind the scenes ores-legacy makes parallel requests to Lift Wing). It is easy to roll out and roll back, but of course moving to Lift Wing would be the best choice. The absence of caching (for the moment) will surely make an impact :)

Let me know!

We have some plans to roll out some basic HTTP caching in the coming weeks :)

ragesoss commented 1 year ago

I did a little more exploration on this, and I couldn't find any info about how to get the features data that the predictions are based on. This is one of the key things we currently use from ORES, eg, https://ores.wikimedia.org/v3/scores/enwiki/?models=articlequality&revids=12345|678910&features

We use that features data to keep track of how many references were added or removed in a given revision (by comparing the features for that revision with the features of the parent revision).
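The comparison described above can be sketched as a subtraction between two feature dictionaries. This is only an illustration: the feature key `feature.wikitext.revision.ref_tags` is an assumption about what the articlequality features contain, and `references_added` is a hypothetical helper name.

```python
# Assumed feature name; check the actual features payload before relying on it.
REF_FEATURE = "feature.wikitext.revision.ref_tags"


def references_added(rev_features, parent_features, key=REF_FEATURE):
    """Return the change in reference count between a revision and its parent.

    A positive result means references were added, a negative one means
    references were removed. A missing parent (first revision of a page)
    is treated as having zero references.
    """
    current = rev_features.get(key, 0)
    previous = (parent_features or {}).get(key, 0)
    return current - previous
```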

Is this data available from Lift Wing?

AikoChou commented 1 year ago

Hi @ragesoss! The data you need is available from Lift Wing.

With ores-legacy, you can use the following URL https://ores-legacy.wikimedia.org/v3/scores/enwiki?models=articlequality&revids=12345|678910&features=true to get the same data.

If you want to use the Lift Wing endpoints directly and not through ores-legacy, you will need to make a POST request with the extended_output parameter set to true to get the features data, e.g.:

curl https://api.wikimedia.org/service/lw/inference/v1/models/enwiki-articlequality:predict -X POST -d '{"rev_id": 12345, "extended_output": true}'

However, multiple revisions input is not supported. For more information, refer to https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_revscoring_articlequality_prediction. :)
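For clients that are not shelling out to curl, the same POST request can be sketched with the Python standard library. The URL and JSON body mirror the curl example; the function names and the User-Agent value are placeholders introduced here for illustration.

```python
import json
import urllib.request

LIFT_WING_BASE = "https://api.wikimedia.org/service/lw/inference/v1/models"


def build_request(wiki, model, rev_id, extended_output=True):
    """Build the POST request for a single revision (no network I/O)."""
    url = f"{LIFT_WING_BASE}/{wiki}-{model}:predict"
    body = json.dumps({"rev_id": rev_id, "extended_output": extended_output})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder value; replace with your tool's name and contact.
            "User-Agent": "example-client/0.1",
        },
        method="POST",
    )


def fetch_score(wiki, model, rev_id, timeout=10):
    """Send the request and decode the JSON body (performs network I/O)."""
    request = build_request(wiki, model, rev_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so a caller that wants to inspect an error body has to catch that exception and read the body from it.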

ragesoss commented 1 year ago

One problem I've found is that LiftWing output is considerably different for error conditions. For example, a revision that has been text-deleted on English Wikipedia is 708326238 (https://en.wikipedia.org/w/index.php?oldid=708326238 had its content hidden: https://en.wikipedia.org/w/index.php?title=Special:Log/delete&page=Philip_James_Rutledge ).

ORES output looks like this:

"enwiki" => {"models"=>{"articlequality"=>{"version"=>"0.9.2"}}, "scores"=>{"708326238"=>{"articlequality"=>{"error"=>{"message"=>"TextDeleted: Text deleted (datasource.revision.text)", "type"=>"TextDeleted"}}}}}

LiftWing output looks like this:

"error" => "Missing resource for rev-id 708326238: TextDeleted: Text deleted (datasource.revision.text)"

I'm trying to migrate to LiftWing by making requests for a set of revisions and then merging them to get the same shape of data as ORES returns for multiple revisions at once, but this throws a wrench in that plan.
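The merge step described here might look roughly like the following. It assumes that a successful single-revision response is nested under the wiki key with "models" and "scores" entries, like the ORES output above; that shape is an assumption, and `merge_responses` is a hypothetical helper. Payloads that do not have the wiki key, such as the `{"error": ...}` shape, are set aside instead of merged, which is exactly where the mismatch shows up.

```python
def merge_responses(wiki, responses):
    """Merge single-revision responses into one ORES-style payload.

    `responses` maps rev_id -> decoded JSON. Returns (merged, unmerged),
    where `unmerged` holds the payloads that lack the wiki key.
    """
    merged = {wiki: {"models": {}, "scores": {}}}
    unmerged = {}
    for rev_id, payload in responses.items():
        body = payload.get(wiki)
        if body is None:
            # Error payloads have no wiki key, so they cannot be merged as-is.
            unmerged[rev_id] = payload
            continue
        merged[wiki]["models"].update(body.get("models", {}))
        merged[wiki]["scores"].update(body.get("scores", {}))
    return merged, unmerged
```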

isaranto commented 1 year ago

The error responses returned by Lift Wing follow the same schema for all inference services, and each response is accompanied by the appropriate response code. That is why we get the above response.

There is also another issue with the ORES responses for requests like the above: the response code. In the above example https://ores.wikimedia.org/v3/scores/enwiki/708326238/articlequality returns a 200 response with the above text in the response while a 404 would be more appropriate.

Since Lift Wing follows a microservice-based architecture, any type of response aggregation should be handled by the client. For revscoring models there are already code examples in Python at https://github.com/wikimedia/machinelearning-liftwing-inference-services/blob/main/ores-legacy/app/utils.py#L62 and we're here to help if needed!
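As one possible shape for client-side aggregation, the sketch below issues one call per revision concurrently and collects the results by revision ID. It is not taken from the linked ores-legacy code; `fetch_one` is a hypothetical callable supplied by the caller that performs the single-revision request, and the worker count is an arbitrary default.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(rev_ids, fetch_one, max_workers=4):
    """Call fetch_one(rev_id) concurrently and return {rev_id: result}.

    `fetch_one` should handle its own errors: an exception raised inside
    it is re-raised here when the results are collected.
    """
    rev_ids = list(rev_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with rev_ids.
        results = list(pool.map(fetch_one, rev_ids))
    return dict(zip(rev_ids, results))
```

Keeping the request function injectable makes it easy to swap in a stub for tests, or to add retries and rate limiting without touching the aggregation code.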

Ores-legacy service will assist in the transition (ores.wikimedia.org will point there for a while) but eventually clients will have to move directly to Lift Wing.

ragesoss commented 1 year ago

Thanks. I'm making progress, just need to test out a few more things. For ORES, we were relying on the type property of the error message, and I think it would be easier for others in the future if LiftWing error responses also had an explicit type property for the errors. (In my code for the Dashboard, I'm now grepping for the error type from the message and then adding the type param to it, so that the Lift Wing API class is the only part that needs to care about the migration.)
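The workaround described above (pulling the error type out of the message) could be sketched like this. The regular expression is inferred only from the single Lift Wing message quoted in this thread, so other error messages may not match; the "Unknown" fallback type and the name `to_ores_error` are assumptions made for illustration, not the Dashboard's actual code.

```python
import re

# Inferred from one example message:
# "Missing resource for rev-id 708326238: TextDeleted: Text deleted (...)"
ERROR_PATTERN = re.compile(
    r"rev-id (?P<rev_id>\d+): (?P<type>\w+): (?P<detail>.*)", re.DOTALL
)


def to_ores_error(message):
    """Convert a Lift Wing error string into an ORES-style error entry."""
    match = ERROR_PATTERN.search(message)
    if match is None:
        # Hypothetical fallback for messages that do not follow the pattern.
        return {"message": message, "type": "Unknown"}
    return {
        "message": f"{match['type']}: {match['detail']}",
        "type": match["type"],
    }
```

For the message quoted earlier in the thread, this yields the same "message" and "type" values that ORES returned for that revision.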

ragesoss commented 1 year ago

> There is also another issue with the ORES responses for requests like the above: the response code. In the above example https://ores.wikimedia.org/v3/scores/enwiki/708326238/articlequality returns a 200 response with the above text in the response while a 404 would be more appropriate.

Maybe? A TextDeleted response means the data does exist and the revision ID is a valid one; it's just that the API is not allowed to provide you with the data you asked for. I'm guessing that the data is technically not available to LiftWing, so from the perspective of that service, it might make sense as a 404. From an end-user perspective, thinking about it as part of the bigger Wikimedia data ecosystem, this particular one seems more like a 403 (forbidden). In any case, I appreciate getting as specific an error response as possible for this kind of thing. Debugging Wikipedia data things often involves piecing together clues about what happened to a piece of content, and the more clear hints the better!

elukey commented 1 year ago

> There is also another issue with the ORES responses for requests like the above: the response code. In the above example https://ores.wikimedia.org/v3/scores/enwiki/708326238/articlequality returns a 200 response with the above text in the response while a 404 would be more appropriate.

> Maybe? A TextDeleted response means the data does exist and the revision ID is a valid one; it's just that the API is not allowed to provide you with the data you asked for. I'm guessing that the data is technically not available to LiftWing, so from the perspective of that service, it might make sense as a 404. From an end-user perspective, thinking about it as part of the bigger Wikimedia data ecosystem, this particular one seems more like a 403 (forbidden). In any case, I appreciate getting as specific an error response as possible for this kind of thing. Debugging Wikipedia data things often involves piecing together clues about what happened to a piece of content, and the more clear hints the better!

From the point of view of Lift Wing, the data is not available in MediaWiki for a specific revision (since there is no text), so it makes sense to return a 404, which should indicate exactly that. An HTTP 403 is related to the API itself, namely that it doesn't authorize the request. Usually this means that the client doesn't have the access rights to use the API (Lift Wing); it is not meant to carry other meanings, IIUC. Finding the right semantics is not trivial, we are trying our best :)