jpahullo / moodle-tool_mergeusers

Merge users script for Moodle
https://moodle.org/plugins/view.php?plugin=tool_mergeusers
22 stars 52 forks source link

Creation of a web service for Merge user accounts plugin #218

Open nvallinoto opened 1 year ago

nvallinoto commented 1 year ago

Hi all, I'm working on moodle development through an external application using web services to add new user (core_user_create_users), to enrol users (enrol_manual_enrol_users), to get users' grades (gradereport_overview_get_course_grades) and so on. All is working well. Now I need to use merge user accounts plugin not only via admin interface but also through the external application. According to your experience is it possible to create a web service for Merge user accounts plugin ? Which are the possible critical issues ? Thanks.

jpahullo commented 1 year ago

Hi @nvallinoto ,

I think this is a new funcionality that can be really nice.

Things to consider:

  1. Merging users may take so long time (> 30 minuts on big instances). So, there should be two kind of web services:
    1. queue merging request: like: from: email, bla@bla.com, to: username: 12345678. This web service will end instantly.
    2. ask for result, if any: from: email, bla@bla.com, to: username: 12345678. So, this would answer something like: not processed, yet; in progress; concluded: with success|error, and attaching the logs.
  2. To enable this asynchronous processeing, this should implement, like in fòrum, an scheduled task to review pending mergings from the queue, and generate new adhoc_tasks for every merging record from the queue. This way, Moodle cron would be managing all these processes on the background.
  3. I'd rather move the current admin web to queue the merging requests, and also those from the CLI.

Web services as such would be easy to implement. The key issue is to adapt the plugin to support them.

nvallinoto commented 1 year ago

Hi @jpahullo, thanks for your suggestions. I could try to implement it although this is my first time writing a web service for moodle. If you can drive me I can try. The problem as I can see is that the plugin doesn't have a queue system but executes immediately the users' merging. So we need to modify quite heavily the plugin adding new functions to enable the asynchronous processing you proposed .. don't know if I'm able to arrive to the end.

jpahullo commented 1 year ago

Hi @nvallinoto !

No worries. We agree: The big issue is changing from synchronous to async merging users.

It is also interesting for our institution, so sooner or later, queueing merging request will land over here.

Thanks for your comments and suggestions.

nvallinoto commented 1 year ago

Hi @jpahullo, if we are going in the direction to create a web service (sooner or later) following your suggestion to enable asynchronous processing I think could be useful to indicate a road map with the main functions (with input and output) to be write in order to abilitate the plugin to work in a such way. Could you start to write a draft document with these steps/pieces of software needed (I imagine that some of the existing code could be reused) ? In this way me or some else interested in this new functionality can start to work on it. What do you think ?

jpahullo commented 1 year ago

It is a good idea, @nvallinoto. Thanks for the proposal. When I was able to write it, I'll do. I think it would be something like a list of tasks posted here.

Finally, the action to merge two users is done by a specific function call. So, all the new parts would be new code, and then the adhoc task would be calling this function. Mainly.

Thanks a lot for the idea!

nvallinoto commented 1 year ago

Hi @jpahullo, if and when you will have time could be the best way to proceed because you are the one who knows better the merge user accounts plugin.

jpahullo commented 1 year ago

The working path to implement a new webservice for merging users could be as follows:

Previous considerations:

  1. Merging users may take so long time (> 30 minuts on big instances). So, there should be two kind of web services:
    1. queue merging request: like: from: email, bla@bla.com, to: username: 12345678. This web service will end instantly. It could return an id for the merging request, which could be used afterwards for its status. Otherwise, any error would be exposed. We could name it as merging request id. Thinking about it, better to return an id, given that we could queue several request with the same data fields.
    2. ask for result given the previous provided merging request id. So, this would answer something like: not processed, yet; in progress; concluded: with success|error, and attaching the logs.
  2. To enable this asynchronous processeing, this should implement, like in fòrum, an scheduled task to review pending mergings from the queue, and generate new adhoc_tasks for every merging record from the queue. This way, Moodle cron would be managing all these processes on the background.
  3. I'd rather move the current admin web to queue the merging requests, and also those from the CLI.
  4. Web services as such would be easy to implement. The key issue is to adapt the plugin to support them.

So, for the part of web services on its own:

  1. queue_merging_request:
    1. Input data (all mandatory):
      1. remove_user_field: string; => this should contain values like username, id, idnumber or any other value currently supported.
      2. remove_user_value: string; => this should contain the given value according to the above reomve_user_field;
      3. keep_user_field: string; => the same as for remove_user_field;
      4. keep_user_value: string; => this should contain the given value according to the keep_user_field;
    2. Output data:
      1. id: int; the merging request id from the queue.
        • Or alternatively the error exposed by Moodle web service call.
  2. get_status_for_merging_request:
    1. Input data (all mandatory):
      1. id: int; the merging request id return by queue_merging_request
    2. Output data:
      1. (mandatory) status: int; status of the request. To enable automatic process from client side.
      2. (mandatory) status_description: string; human readable description of the status.
      3. (optional) logs: string; the JSON with the list of actions performed while merging users. This will only be returned when the request was completed (with sucess or error).

We could think about this statuses (the seen number from the ordered list could be the status id):

  1. missing merging request id. This id really does not exist.
  2. queued, not processed.
  3. queued to be processed (i.e., the adhoc_task is created for it)
  4. in progress; not concluded. (i.e., its adhoc_task has started)
  5. tried with error; pending of retrial. (i.e., the merging process has found some error and the adhoc_task is queued to be retried again in the future).
  6. completed with success.
  7. completed with errors. (i.e., even with all the current retries, the merging process could not end with success).

To just implement those web services, you could even not interact with a real merging request, just return a predefined skeleton. For instance:

For how to implement web services on Moodle in any plugin, please follow the official documentation from Moodle, and check existing web services implementation.

About users and tokens: This functionality is very sensitive, so we should add a comment on the README.md for asking Moodle administrator to build a new token for this web service manually via web. That generated token should be pass with a full URL to both web services to the client side, so that they can ask for merge users. We can also add a note that IT service to firewall those calls allowing them only from the specific client side and to the Moodle URL.

This would be a great effort for having those web services implemented.

For a real merging request implementation (queue, scheduled tasks and so on), we would only need to replace the code that generates the predefined code dealing with a real data model (new tables on databases and so on).

jpahullo commented 1 year ago

Missing drafts are:

  1. Other web services: are maybe other web services necessary? Like listing the current queue of merging requests and not just the one given its id?
  2. Queue: passing from current implementation to queue merging requests. Affects to:
    1. Adding new mergings via current web into the queue.
    2. CLI mergings have to be queued.
    3. Web services (if any) have to queue new requests.
    4. The table for this queue should contain these columns: remove_user_field, remove_user_value, keep_user_field, keep_user_value, timeadded, status (could be defined with default value to queued, not processed yet), timemodified, retries (the current number of retry). The above status_description should be a internationalized string.
  3. Scheduled task: pass from merging during user time, to queue the merge request.
    1. Define a scheduled task. The scheduled task should create an adhoc_task for merging every single merging request from the queue. So, this scheduled_task should add a new adhoc_task for each record from the queue. And this should update its status. This should only create adhoc_tasks for queue entries with status "not processed yet".
    2. Define an adhoc_task: each adhoc_task should mainly run the current merging process implementation. Adhoc_task, when fail, they are queued by Moodle to be repeated until it is succesful. We should implement a plugin setting to define maximum repetitions of any single merging request in case of error. I personally default it to 3. The implementation of the adhoc_task should consider this value always, so that in case of reaching this number of repetition, adhoc_task should not throw an error, and end it "as if it was correct", setting the specific status to that request. This number of default number of retries can be overriden on config.php from Moodle core: so we should document how to do that also for administrators. There would be a preface within the adhoc_task that have to try to identify the user.id for the remove_user and keep_user. This new part have to face scenarios where the given user may not be found and merging process would end. Cases to deal with on this preface:
      1. Remove user is missing, keep user is missing. Nothing to do. It should make an error, so the adhoc_task is queued again to be repeated. Maybe the users are not yet loaded into Moodle?
      2. Remove user is missing, keep user is present. Nothing to do. It is already ok. mark request as completed with success.
      3. Remove user is present, keep user is missing. Special case: if the field is username, we could simply update the user record with the keep_user_value (the new username) and the merging process would end. Similarly for email or idnumber. If the field is the user.id, then, the merging process should end with error. Mark request status accordingly.
      4. Remove user is present, keep user is present. Proceed as it does already with the given user.id for both users. Mark request status according to the presence or absence of errors.
  4. List of merging logs: this page should probably include the list of queued merging requests and showing the specific status, and also mix them with the list of already performed merging processes. Here, to me, the data model is not at all clear right now. Should we migrate the current logs into a single queue of merging requests? so that only a single table is retrieved from that web page, and the whole process for queueing and retrieving logs is simplified as much as possible. Pending merging request will have no logs and other related details for completed merging processes. I think we should go this way. To do so, we also should implement a process to migrate the current logs into the new queue table (or adapt the current logs table, rename it and adjust its structure to the new requirements).
jpahullo commented 1 year ago

Hi @nvallinoto,

I've done some kind of drafts as comments on this issue. I think the one about web services would be sufficient to start with.

Thanks!

nvallinoto commented 1 year ago

Hi @jpahullo, impressive list of things to do. Are there any activities concerning the database too or the queue of merging requests is managed by existing tables?

jpahullo commented 1 year ago

ehehehe, if you read it with calm, you can see that there must be changes on the tables related to this plugin. Mainly the one is related to the queue of merging requests, that I think should be the combination of the data fields related for a merging request and the current information existing on the merging logs (see my last comment about it).

jpahullo commented 1 year ago

But to just get working web services, without login in the background, you can implement it without touching anything of the plugin itself.

Thanks to you @nvallinoto !

nvallinoto commented 1 year ago

ehehehe, if you read it with calm, you can see that there must be changes on the tables related to this plugin. Mainly the one is related to the queue of merging requests, that I think should be the combination of the data fields related for a merging request and the current information existing on the merging logs (see my last comment about it).

you are right .. in the final point of the missing draft.

nvallinoto commented 1 year ago

dear all, I would like to start developing the new web services with the proposals made by Jordi. Before starting I'll share in this thread a diagram summarizing the contents of this possible improvement of the plugin in order to collect any further suggestions.
Nicola

nvallinoto commented 1 year ago

20230216_EN_WebServices_MergeUserAccounts.pdf This is the diagram of the proposed web services for Merge user accounts plugin. Please have a look and react with your critics and improvements.
@jpahullo I followed your suggestions.

jpahullo commented 1 year ago

Hi!

thanks for sharing your thoughts and proposal.

Regarding the pdf schema: it’s very good in the overall.

There are a couple of things I’d suugest or ask though:

  1. Web service get_queue_list_merging_request has some kind of filtering parameters, right? If so, they are not so well specified, and I’d suggest to provide the list of parameters expected and optional the query can be filtered by. For instance, if no argument is provided, the whole queue will be returned. Instead, if an id is provided, only thar item will be retirned; or if remove_user_field, remove_user_value are provided, only the list of requests matching those two values will be returned. So, basically, I’d start with defining optionally all columns you are defining in the queue (except for logs).
  2. The table definition has not defined the ‘logs’ column. Remember that, finally, the same queue will contain new requests but also requests succesfully processed. Filtering by status will be suficient to know the list of requests in every status.
  3. consider all this information to build correctly an index on the queue table, so that sql queries will run optimally from performance point of view. Maybe several indexes are required.

As I said, all is pretty pretty good. Good job!

nvallinoto commented 1 year ago

Hi, thank you very much for your suggestions. The point 1) is ok for me. Regarding the point 2) what is the content of the log field? I will update the diagrams. Nicola

jpahullo commented 1 year ago

The column logs contain information related to time started and end of the merging operation, as well as all those SQL sentences executed. It is a column of type text field and its content is a JSON, with an array of strings, one array position per every log operation during the execution of the merging operation. For the purpose of the diagram, it will be sufficient to add the column logs, as the rest of the fields.

Thanks!

nvallinoto commented 1 year ago

Hi @jpahullo, in attachment you will find the updated diagram of the new web services with your last replies. 20230222_EN_WebServices_MergeUserAccounts.pdf If no other suggestions is arriving I would like to start to write some code. Which is your suggestions to start (just to fork the last version and starting to develop at local server )? Nicola

jpahullo commented 1 year ago

Hi @nvallinoto,

I think it is better now.

Today I realized that two web services were somehow overlapping each other: the two get_…

I think we can just keep the last one, more versatile, and the one really necessary. With that one, we can still ask for an id, and it will return an empty list or just a list with one item.

That’s my opinion.

You can start developing as you say. That’s perfect.

Work first on the web services part as I suggested. Leave the conversion of current merging into scheduled and adhoc tasks for the last.

Let’s go!!

nvallinoto commented 1 year ago

you are right the second web service is included in the third one. I removed the second one 20230224_EN_WebServices_MergeUserAccounts.pdf . Here the new updated diagram.

jpahullo commented 1 year ago

Let’s go!!!!

nvallinoto commented 1 year ago

Hi @jpahullo, I wrote the interface of the two API: queue_merging_request and get_queue_list_merging_request Now I see them in the API documentation moodle page. / Next step is the creation of Moodle table to manage the queue of the merging requests. I have a question: how to add a sequence in file install.xml inside db folder ? In attachment mergeusers_api_table.txt you will find the possible content of the update.xml file. Thanks, Nicola

jpahullo commented 1 year ago

Hi @nvallinoto ,

Thanks for working on it. Thanks a lot.

I revisited your xmldb file. In general, it seems pretty good. the indexes for both values will help a lot if we allow admins to search requests through the queue.

I also need to make some comments:

  1. the length of the removeuservalue and keepuservalue are arbitrarily of varchar(30). Why? I can see that user.idnumber is a varchar(255). So we have to update them to that maximum.
  2. since the queue have to serve for both cli/your bright new web services/current plugin web, we need to add also two more fields: removeuserid and keepuserid: these two fields has the whole sense when an admin is using the current plugin web for searching and defining the couple of users to merge. So, when using the current plugin web all 6 fields will be populated by the web form. This is not something you have to do, but we need to consider right now. In other case, both fields can be perfectly null. When merging the users, the process could also update these fields to set them up with the proper user.id too.
  3. You forgot to add the field retries as stated in the above comment. It is necessary since the adhoc task that will process the request, need to account how many retries exist for this request. Moodle retry any failing adhoc task since it finishes with success. We will need to adjust this default behaviour by managing the number of retries internally.
  4. Originally I was thinking on the field statusdescription as a storage of a key from a internationalized string. However, we can build that string using the status code, and having internationalized string like $string['status_1_description'] = "Request queued"; for a $status = 1, and so on. Then, we can safely remove the field statusdescription.
  5. We need two more indexes: by timecreated and timemodified and status. When listing the queue you problably would like to see the last updated or the last created to see how it went. Without indexes, that listing would be very expensive. The two indexes should include {timecreated, status} and the other {timemodified, status}. This way, we will be able to filter by time first, and then by status (in progress, completed, completed with errors, etc.)

Just as an additional comment:

For web services and CLI processes may be able to have certain search criteria, I mean, a certain value pair {removeuserfield + removeuservalue} and/or {keepuserfield + keepuservalue} that may return more than a user record. In those cases, the request cannot continue and have to abort, without any retry. Just abort. This is a new state that we have to consider for the field status.

Consider only those details that affect to your development. I provides a lot of details and "why"s, to help you understand what we are looking for behind any decision.

Thanks again,

Jordi

nvallinoto commented 1 year ago

Hi @jpahullo, thanks a lot for your very useful comments. I introduced all of your suggestions in database schema. You will find it in attachment. 20230320_mergeusers_api_table.txt In my previous comment I asked how (if it is possible) add a sequence in the schema. The field id has a related sequence <FIELD NAME="id" TYPE="int" LENGTH="10" NOTNULL="true" SEQUENCE="true" NEXT="removeuserfield"/> The question is where and how to define the sequence in order to be created when you'll install or update the plugin ? You need to create it manually in the database ? I have seen other examples of install.xml in Moodle code but the sequence are not present. Nicola

nvallinoto commented 1 year ago

Hi @jpahullo,

I have updated the diagram 20230317_Moodle_MergeUserAccounts_API.pdf with the two web services following the last suggestions regarding the columns definition of the new Moodle table.

I have seen your request on Telegram channel (https://t.me/moodledev/122358). I wrote the name of the web service functions in admin/tool/mergerusers/db/services.php and the related methods in admin/tool/mergerusers/classes/external.php

I don't know if the way I just described is the component_class_callback (IIRC) or the subplugin mode to do it.

After that I wrote a new table to be added to the file admin/tool/mergerusers/db/install.xml for a fresh install of the plugin and to the admin/tool/mergerusers/db/upgrade.php for a plugin's update. I already tried to update the Moodle database using the upgrade.xml file but after adding +1 to $plugin->version in version.php the new table hasn't created.

Nicola

jpahullo commented 1 year ago

Hi @nvallinoto ,

Thanks for the work. The table schema seems ok for me.

The XML file should be db/install.xml, produced by the XMLDB Editor, as you did.

The upgrade file should be db/upgrade.php. See this plugin or other plugins on how to perform that. Also, XMLDB Editor helps you producing the PHP code for the upgrade.php.

However, you have to think on this: any Moodle instance that is having this plugin already installed, db/upgrade.php must produce the specific new table schema, and all existing data migrated. So, you have to think on a process for the db/upgrade.php to pass the table and any existing data from the table named tool_mergeusers to the tool_mergeusers_queue, so that no data is lost.

This change is a big change, so take care on producing the db/upgrade.php. Also, the table schema is not the same, so there would be empty/null fields, like for the removeuserfield and removeuservalue and the equivalent for the keepuserfield and keepuservalue. timecreated and timemodified would be the same. success field would be encoded as the corresponding status code. And the log field will be kept. You can imagine a process of producing a new fresh table and migrating the whole data or renaming the existing table and producing the changes column by column.

In the end, we have to consider the observers too, so that the success and failing merging operations trigger the corresponding events.

All this is a lot of work. But I think you'd better build your part for the queue table, necessary for your webservices, produce tests, and then go on for the rest of the changes, if you are able to.

Any little contribution will be very welcome.

Last, but not least: your questions:

Thanks again,

Jordi

nvallinoto commented 1 year ago

at the moment I will continue to develop without thinking to the data to be exported from the actual table to the new one. I have one more detailed question concerning the sequence. After the table is created a sequence is created automatically with the definition "sequence = true" or I need to create manually ? I had a look to the database and there are as many sequences (in folder sequences) as tables with a sequence. All tables have sequence.

jpahullo commented 1 year ago

No no, as I said, the id column is per se an auto-increment index. You have to do nothing, but install/generate that table. As simple as that.

Thanks

nvallinoto commented 1 year ago

I created the table _tool_mergeusersqueue using upgrade.php and I have seen that magically the sequence tool_mergeusers_queue_id_seq was created too. Well. Let's go on.

nvallinoto commented 1 year ago

the first web service tool_mergeusers_enqueue_merging_request is working well. I added the values of the two fields removeuserid and keepuserid too.

jpahullo commented 1 year ago

Congrats! Go, go go!

nvallinoto commented 1 year ago

refining the web service perhaps could be better to handle errors as written in this documentation page https://docs.moodle.org/dev/Errors_handling_in_web_services ? Do you have a simple example of it ?

jpahullo commented 1 year ago

What do you mean by "refining web services... to handle errors"?

In general, if some error occurs during the execution of the web service, it must behave as nothing happened and returning the error back to the client.

As the documentation states, if during the execution of the web service, an exception is through (anything being or extending moodle_exception()) will automatically fill the error fields and details back to the client.

So, in your web service implementation, you only need to through a new moodle_exception whenever necessary. The rest is managed by Moodle for you.

So, for instance, queueing a new merge request should happen always. Always i always. It's only adding a record into the table. That's it. Nothing else. So, it will happen always EXCEPT the moodle database is unaccessible or the table does not exist (for any reason). In that case I think Moodle would throw an exception automatically, so you should not have to take care of it.

For the other web service, asking the status of a given merge request should behave similarly as before. This web service only asks for the status of a given record or records.

Thanks for the work,

Jordi

nvallinoto commented 1 year ago

Thank you for your comment about the way moodle works to raise exception from server to the client. Concerning the other web service we decided to include the second in the last one (more inclusive and just not to have two web services with same results in part). So the other web service will return an object that is a list. Have a look to the last diagram https://github.com/ndunand/moodle-tool_mergeusers/files/11019158/20230317_Moodle_MergeUserAccounts_API.pdf and let me know if it's ok for you.

jpahullo commented 1 year ago

Yes @nvallinoto , all is ok.

My words were thinking on this lasr change: in total we have 2 web services.

The second one returns always a list. But we know that if we asked for a request id, the list eill be empty or just one item. Instead, if we asked by a given username, the list could be empty, containing one or more elements in the list.

Thanks!!!

nvallinoto commented 1 year ago

Hi Jordi, I terminated the second web service. I need only to validate the array with the parameters but the rest is working well and I added as optional parameters the removeuserid and keepuserid too! Next week I'll start the task process.

jpahullo commented 1 year ago

Good to know!

Before going on the task part, consider, please, adding some tests for the web services logic.

New code should be accompanied with tests.

Thanks for the work @nvallinoto

nvallinoto commented 1 year ago

Hi, I did some tests using postman. Or you mean other type of test ? in the case how to add them to plugin ? Nicola

jpahullo commented 1 year ago

Yeah, postman, curl, etc are good at manual testing. And they are good!

What I mean is adding phpunit or behat tests. There are some added already to the plugin (see directory tests/). How to pass/execute them? You need to look for it on the moodledev docs site.

These tests are also code delivered within the plugin, that says what is expected to happen.

In your case, for this part of web services, you do not need to test the whole web service invocation, but only test the logic of inserting the record in the table and the one getting some records from the queue table, respectively.

If you link me here the branch in your repo, I could help you a bit more with my explanations.

Again, thanks!!!!

Jordi

nvallinoto commented 1 year ago

ok I will have a look to directory /test to understand how to add phpunit or behat tests for a plugin. As soon as I copy the plugin code in my repo I will let you know.

nvallinoto commented 1 year ago

Hi Jordi, you will find the file added or modified here https://github.com/nvallinoto/moodle-tool_mergeusers (version.php, CHANGES.md, db/services.php, db/upgrade.php, classes/external.php) Nicola

nvallinoto commented 1 year ago

Updated diagram (28.03.23) of API Moodle interface to extend the plugin Merge user account

20230328_Moodle_MergeUserAccounts_API.pdf

nvallinoto commented 1 year ago

Hi Jordi,

following the diagram (https://github.com/ndunand/moodle-tool_mergeusers/files/11088703/20230328_Moodle_MergeUserAccounts_API.pdf) I wrote the _get_queue_mergingrequest task and now I need your help to define the adhoc task _merge_useraccounts

class merge_user_accounts extends \core\task\adhoc_task {
    /**
     * Execute the task.
     */
    public function execute($usertokeep, $usertoremove) {
        // do the real merge user accounts
    }
}
jpahullo commented 1 year ago

Hi @nvallinoto ,

If you have implemented the web services, this would conclude in a PR to solve this issue.

Prior resolving this issue, we should do an upgrade process of any existing log into the new table of merge requests.

This wouldn't be integrated into master, but into another different branch (this will be my job), since it will be a work-in-progress branch to include all changes from web services + tasks, to adapt completely the current behaviour into merge requests and tasks.

Then, the part of scheduled tasks + adhoc tasks comes on #207. So, comments on scheduled tasks will be on #207 now on.

Thanks a lot,

Jordi

nvallinoto commented 1 year ago

Regarding the webservices the right conclusion needs the test part to be included. Any help is welcome because it's my first attempt to do it. Any suggestion is welcome. Concerning the update process (moving the log to the new table) can you describe it better ? Thanks for all.

jpahullo commented 1 year ago

You're right about tests. I let you know @nvallinoto about your doubts. I have to go now.

Thanks a lot,

Jordi

nvallinoto commented 1 year ago

Hi @jpahullo in order to implement the web service in the correct way you wrote that "we should do an upgrade process of any existing log into the new table of merge requests." (https://github.com/ndunand/moodle-tool_mergeusers/issues/218#issuecomment-1491811363) Are correct the following links between the table "tool_mergeusers" fields and the new one ?

nvallinoto commented 1 year ago

Hi @jpahullo, I've updated the diagram which summarizes graphically the way I've followed in writing the code. 20230403_Moodle_MergeUserAccounts_API.pdf Please let me know if it makes sense for you. Nicola

jpahullo commented 1 year ago

Hi @nvallinoto,

Regarding to the migration process:

Thanks a lot,

Jordi