ojsde / lucene

Plugin for Solr/Lucene support in OJS.
GNU General Public License v2.0
0 stars 8 forks source link

[OJS 3.2.1.4] Problem plugin Lucene Solr #14

Open diegomejia071 opened 2 years ago

diegomejia071 commented 2 years ago

Hi,

I did the installation of the Lucene/Solr plugin with Embedded Server Mode: Installation and Configuration, but when I enter and try the search endpoint http://xxxxxxxxx:8983/solr/ojs/search, it gives me the following error in the browser:

image

Also, when trying to configure the plugin from the plugins section of the OJS platform, it does not open and stays loading:

image

When rebuilding the index php tools/rebuildSearchIndex.php -d shows me the following error:

LucenePlugin: Reset index…done LucenePlugin: Indexing “Philology. Academic and cultural newsletter”

PHP Warning: Use of undefined constant STATUS_PUBLISHED - assumed 'STATUS_PUBLISHED' (this will throw an Error in a future version of PHP) in /plugins/generic/lucene/classes/SolrWebService.inc.php on line 178

And at the end the indexing crashes and shows the following error:

` PHP Fatal error: Uncaught Exception: DB Error: MySQL server has gone away Query: UPDATE submission_settings SET setting_value = ? WHERE setting_name = ? AND submission_id = ? in /lib/pkp/classes/db/DAO.inc.php:703 Stack trace:

0 /lib/pkp/classes/db/DAO.inc.php(231): DAO->handleError(Object(ADODB_mysqli), 'UPDATE submissi...')

1 /plugins/generic/lucene/classes/SolrWebService.inc.php(292): DAO->update('UPDATE submissi...', Array)

2 /plugins/generic/lucene/classes/SolrWebService.inc.php(213): SolrWebService->_indexingTransaction(Array, 200, 157)

3 /plugins/generic/lucene/LucenePlugin.inc.php(1178): SolrWebService->pushChangedArticles(200, 157)

4 /plugins/generic/lucene/LucenePlugin.inc.php(706): LucenePlugin->_rebuildIndex(true, Object(Journal), true, true, 'LucenePlugin: R...')

5 /var/www/htm in /lib/pkp/classes/db/DAO.inc.php on line 703

PHP Fatal error: Uncaught Exception: DB Error: MySQL server has gone away Query: UPDATE sessions SET user_id = ?, ip_address = ?, user_agent = ?, created = ?, last_used = ?, remember = ?, data = ?, domain = ? WHERE session_id = ? in /lib/pkp/classes/db/DAO.inc.php:703 Stack trace:

0 /lib/pkp/classes/db/DAO.inc.php(231): DAO->handleError(Object(ADODB_mysqli), 'UPDATE sessions...')

1 /lib/pkp/classes/session/SessionDAO.inc.php(109): DAO->update('UPDATE sessions...', Array)

2 /lib/pkp/classes/session/SessionManager.inc.php(185): SessionDAO->updateObject(Object(Session))

3 [internal function]: SessionManager->write('5oc1ldi6o93m7uf...', '')

4 [internal function]: session_write_close()

5 {main}

thrown in /lib/pkp/classes/db/DAO.inc.php on line 703`

And when this plugin is active, the journals are not displayed, they take a long time to load:

image

ronste commented 2 years ago

Hi @diegomejia071 ,

this is indeed very unusual behaviour. On our servers the lucene plugin is running well with OJS 3.1.2-4.

The error messages and warnings you show point to a severe disruption of the system. In particular the warning "PHP Warning: Use of undefined constant STATUS_PUBLISHED - assumed 'STATUS_PUBLISHED' (this will throw an Error in a future version of PHP) in /plugins/generic/lucene/classes/SolrWebService.inc.php on line 178" puzzles me since the constant is defined in "lib/pkp/classes/submission/Submission.inc.php" which is not part of the lucence plugin.

I don't have a clue at the moment what might be the problem. How did you install the plugin? Did you clone it to your server or did you download a release package? For OJS 3.1.2 we don't have a release package and you currently have to use the master branch.

Please verify the release version and date in your version.xml file in the lucene folder are:

    <release>1.1.0.0</release>
    <date>2019-11-20</date>

If these are Ok you should have a closer look at the PHP error log. Look for any references to "lucene" that could help to pinpoint the problem.

Regards, Ronald

diegomejia071 commented 2 years ago

Hi @ronste,

I'm sorry I made the wrong version, the version we are using is 3.2.1.4. I would like to know, if I can do something in this version, so that the plugin works correctly, since we are very interested in using it.

Install the plugin, from the OJS Plugin Gallery. Activate the plugin and configure it according to the following documentation.

The version of Solr I use is 8.10.1.

Plugin version is:

<release>1.2.0.1</release> <date>2021-05-23</date>

ronste commented 2 years ago

Hi @diegomejia071 ,

I just installed the Lucene/Solr plugin on my OJS 3.2.1-4 test server from the plugin gallery (I didn't start the Solr server). I could open the settings dialogue without any problems.

My suspicion is that there might be another problem with your installation. The warning about STATUS_PUBLISHED and the DB error message point in that direction. Please try the following:

1) Shut down the Solr server and disable the Lucene/Solr plugin. 2) Make sure your OJS installation runs as expected, e.g. open settings dialogues from other plugins or do a search via the frontend. Or rebuild the index. Always have a look at the PHP error log! 3) Enable the Lucene/Solr plugin. The dialogue should open correctly without the Solr server started. If it doesn't open please send the related messages from the PHP error log. 4) If the Lucene/Solr settings dialogue opened correctly start the Solr server and open the settings dialogue again. If it doesn't open have a looked at the PHP error log again.

I hope with this procedure we can pinpoint the problem somehow.

Best wishes, Ronald

diegomejia071 commented 2 years ago

Hi @ronste,

  1. Shut down the Solr server and disable the Lucene/Solr plugin. Shut down the Solr server normally and disable the Lucene/Solr plugin.

  2. Make sure your OJS installation runs as expected, e.g. open settings dialogues from other plugins or do a search via the frontend. Or rebuild the index. Always have a look at the PHP error log! Everything works normally in my installation and the dialogs of the different plugins open without any problem. There are no error records in the php error log.

  3. Enable the Lucene/Solr plugin. The dialogue should open correctly without the Solr server started. If it doesn't open please send the related messages from the PHP error log. Enable the plugin again, without starting the server. The dialog opens, but it takes too long, also if I try to load the journal, it stays loading, and it shows me the following in the php error log.

PHP Warning: Declaration of LuceneSettingsForm::execute() should be compatible with Form::execute(...$functionArgs) in /plugins/generic/lucene/classes/form/LuceneSettingsForm.inc.php on line 0, referer: https://xxxxxxxxxx/vitae/management/settings/website image

And it shows me the following in the error log. PHP Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 380928 bytes) in /plugins/generic/lucene/LucenePlugin.inc.php on line 865, referer: https://revistas.udea.edu.co

  1. If the Lucene/Solr settings dialogue opened correctly start the Solr server and open the settings dialogue again. If it doesn't open have a looked at the PHP error log again. The solr server starts normally, as it should be: image

He dialog opens, but it takes too long.

image

Also if I try to load the journal, it stays loading. image

And it shows me the following in the php error log.

PHP Warning: Declaration of LuceneSettingsForm::execute() should be compatible with Form::execute(...$functionArgs) in /plugins/generic/lucene/classes/form/LuceneSettingsForm.inc.php on line 0, referer: https://xxxxxxxxxx/vitae/management/settings/website

ronste commented 2 years ago

Hi @diegomejia071 ,

that's a good starting point: "PHP Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 380928 bytes) in /plugins/generic/lucene/LucenePlugin.inc.php on line 865, referer: https://revistas.udea.edu.co ".

This means the Lucene/Solr-Plugin tries to allocate a rather small amount (380928 bytes) of memory but there is no more memory available, the size of 1073741824 is already exceeded. This also slows down the opening of the dialogue and the display of the frontend.

You should try to figure out at what point the memory limit is exceeded. I would suggest to repeat the above procedure and watch the memory consumption of the processes on your server. You mentioned above that already without the Solr sever started the system becomes very slow. So my suspicion is that the process consuming a large amount of memory is already running. It is probably not the Solr server itself.

Best wishes, Ronald

diegomejia071 commented 2 years ago

Hi @ronste,

But look, as I told you in the previous post, I just activated the plugin, but without starting the server, I already have these problems.

Enable the plugin again, without starting the server. The dialog opens, but it takes too long and it shows me the following in the php error log.

PHP Warning: Declaration of LuceneSettingsForm::execute() should be compatible with Form::execute(...$functionArgs) in /plugins/generic/lucene/classes/form/LuceneSettingsForm.inc.php on line 0, referer: https://xxxxxxxxxx/vitae/management/settings/website

Also if I try to load the journal, it stays loading and not open:

image

And it shows me the following in the error log.

PHP Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 380928 bytes) in /plugins/generic/lucene/LucenePlugin.inc.php on line 865

And all this happens only by activating the plugin, but if I deactivate it, the platform, the journals and the other plugins work correctly.

ronste commented 2 years ago

Hi @diegomejia071 ,

unfortunately I am not able to reproduce the problem on my test server. I can open or close the dialogue and/or frontend page without any problems and independent on any settings I enable. So I can only guess what my be happening on your system.

We can try another approach:

1) The Warning "PHP Warning: Declaration of LuceneSettingsForm::execute() should be compatible with Form::execute(...$functionArgs) in /plugins/generic/lucene/classes/form/LuceneSettingsForm.inc.php on line 0, referer: https://xxxxxxxxxx/vitae/management/settings/website" is not significant to the problem. If it disturbs you you can easily correct this by editing the file https://github.com/ojsde/lucene/blob/stable-3_2_0/classes/form/LuceneSettingsForm.inc.php#L125 on your server. Replace "function execute()" by "function execute(...$functionArgs)" and the warning disappears.

2) We have to pinpoint the exact point when the system becomes slow and unresponsive.

a) Does the system become slow already when you enable the plugin but not open the settings dialogue or only after you open the settings dialogue? This is an important distinction for identifying the line of code the might cause the problem.

b) The main issue is related to the PHP error message referring to [line 865 of in /plugins/generic/lucene/LucenePlugin.inc.php](https://github.com/ojsde/lucene/blob/stable-3_2_0/LucenePlugin.inc.php#L865). 

  This line is only executed when autosuggest is enabled in the plugin settings. Could you try to disable autosuggest and see what happens? (On my system I have no problems even with autosuggest enabled)

  If you have other options enabled please try to disable them and see what happens.

As I said before I am not able to reproduce the issue on my server. I am also not the developer of the plugin. I just did some minor maintenance. Currently I don't have any idea what the problem might be.

Maybe others might be able to help: @withanage , @asmecher

diegomejia071 commented 2 years ago

Hi @ronste,

The journal does not load me as soon as I enable the plugin.

PHP Notice: Undefined offset: 1 in o/plugins/generic/lucene/LucenePlugin.inc.php on line 468, referer: https://xxxxxxxxxx/index.php/ikala/search/search?csrfToken=10c6780437b58f40a04602293cd41996&query=lire PHP Fatal error: Out of memory (allocated 65011712) (tried to allocate 90112 bytes) in /plugins/generic/lucene/LucenePlugin.inc.php on line 865, referer: https://www.google.com/ PHP Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 376832 bytes) in /var/www/html/site.revistas.udea.edu.co/plugins/generic/lucene/LucenePlugin.inc.php on line 865

mmap() failed: [12] Cannot allocate memory

An error has occurred while someone was trying to search for the installation. The error message generated by the web service is: "Unfortunately the OJS search service is currently offline." This email was generated by Open Journal Systems' Lucene search module.

ronste commented 2 years ago

Hi @diegomejia071 ,

ok, we are getting closer to the issue. But I still don't see where this big chunk of memory is allocated. The next step I would try on my server is to debug the code line by line. Don't know if you can do this on your server.

Otherwise I am not able to help much any more. That's why I mentioned my colleagues above.

I would suggest you post this issue again at https://github.com/pkp/pkp-lib/issues. There are more people who could potentially help with this issue. Mention the OJS and plugin version numbers and include a link to our discussion here.

diegomejia071 commented 2 years ago

Hi @ronste,

When I want to re-index the search with php rebuildSearchIndex.php -d at some point the following error comes out.

`PHP Fatal error: Uncaught Exception: DB Error: MySQL server has gone away Query: UPDATE submission_$ Stack trace:

0 /lib/pkp/classes/db/DAO.inc.php(231): DAO->handleError(Object(ADODB_mysqli), 'UPDATE submissi...')

1 /plugins/generic/lucene/classes/SolrWebService.inc.php(292): DAO->update('UPDATE submissi...', Array)

2 /plugins/generic/lucene/classes/SolrWebService.inc.php(213): SolrWebService->_indexingTransaction(Array, 200, 45)

3 /plugins/generic/lucene/LucenePlugin.inc.php(1178): SolrWebService->pushChangedArticles(200, 45)

4 /plugins/generic/lucene/LucenePlugin.inc.php(706): LucenePlugin->_rebuildIndex(true, Object(Journal), true, true, 'Lucen$

5 /lib/pkp/classes/db/DAO.inc.php on line 703

PHP Fatal error: Uncaught Exception: DB Error: MySQL server has gone away Query: UPDATE sessions SET user_id = ?, ip_address = ?, user_agent = ?, created = ?, last_used = ?, remember = ?, data = ?, domain = ? WHERE session_id = ? in /lib/pkp/classes/db/DAO.inc.php:703 Stack trace:

0 /lib/pkp/classes/db/DAO.inc.php(231): DAO->handleError(Object(ADODB_mysqli), 'UPDATE sessions...')

1 /lib/pkp/classes/session/SessionDAO.inc.php(109): DAO->update('UPDATE sessions...', Array)

2 /lib/pkp/classes/session/SessionManager.inc.php(185): SessionDAO->updateObject(Object(Session))

3 [internal function]: SessionManager->write('n95lk5ue5ouuu0n...', '')

4 [internal function]: session_write_close()

5 {main}

thrown in /lib/pkp/classes/db/DAO.inc.php on line 703`

I don't know if the MySQL execution time could be, or the memory, the truth is I don't know where I could check.

Where do you want me to review or debug the code line by line?

Also because when it rebuilds the search index, there are journals to which it does not index anything. For example:

LucenePlugin: Reset index...done LucenePlugin: Indexing "Philology. Academic and cultural newsletter". 0 articles indexed done

ronste commented 2 years ago

Hi @ronste,

When I want to re-index the search with php rebuildSearchIndex.php -d at some point the following error comes out.

PHP Fatal error: Uncaught Exception: DB Error: MySQL server has gone away Query: UPDATE submission_$ Stack trace: #0 /lib/pkp/classes/db/DAO.inc.php(231): DAO->handleError(Object(ADODB_mysqli), 'UPDATE submissi...') #1 /plugins/generic/lucene/classes/SolrWebService.inc.php(292): DAO->update('UPDATE submissi...', Array) #2 /plugins/generic/lucene/classes/SolrWebService.inc.php(213): SolrWebService->_indexingTransaction(Array, 200, 45) #3 /plugins/generic/lucene/LucenePlugin.inc.php(1178): SolrWebService->pushChangedArticles(200, 45) #4 /plugins/generic/lucene/LucenePlugin.inc.php(706): LucenePlugin->_rebuildIndex(true, Object(Journal), true, true, 'Lucen$ #5 /lib/pkp/classes/db/DAO.inc.php on line 703 PHP Fatal error: Uncaught Exception: DB Error: MySQL server has gone away Query: UPDATE sessions SET user_id = ?, ip_address = ?, user_agent = ?, created = ?, last_used = ?, remember = ?, data = ?, domain = ? WHERE session_id = ? in /lib/pkp/classes/db/DAO.inc.php:703 Stack trace: #0 /lib/pkp/classes/db/DAO.inc.php(231): DAO->handleError(Object(ADODB_mysqli), 'UPDATE sessions...') #1 /lib/pkp/classes/session/SessionDAO.inc.php(109): DAO->update('UPDATE sessions...', Array) #2 /lib/pkp/classes/session/SessionManager.inc.php(185): SessionDAO->updateObject(Object(Session)) #3 [internal function]: SessionManager->write('n95lk5ue5ouuu0n...', '') #4 [internal function]: session_write_close() #5 {main} thrown in /lib/pkp/classes/db/DAO.inc.php on line 703

I don't know if the MySQL execution time could be, or the memory, the truth is I don't know where I could check.

Where do you want me to review or debug the code line by line?

Also because when it rebuilds the search index, there are journals to which it does not index anything. For example:

LucenePlugin: Reset index...done LucenePlugin: Indexing "Philology. Academic and cultural newsletter". 0 articles indexed done

Could you monitor the memory consumption on the server while you are trying to rebuild the index? Does it loose connection to MySQL due to lack of memory?

This would be interesting because we still don't know whether the memory issue is caused by backend code or while generating templates for the browser.

ronste commented 2 years ago

Where do you want me to review or debug the code line by line?

Depending on your system there are different options. You could use an IDE suitable for PHP with XDebug (has to be installed on the server) to do step by step debugging.

If you don't have this possibility you could add debug statements to the code that write to the error log. Something like this would be suitable (I didn't test this code):

error_log("My_DebugPoint: ".print_r(memory_get_usage(true),true)); 

You should start with the function register($category, $path, $mainContextId = null) in https://github.com/ojsde/lucene/blob/31b7cba593b3db5e04371da068caf7cddce4014a/LucenePlugin.inc.php#L102. This is the first function to be executed.

Put a debug statement at the beginning and the end of the function an look for the memory allocated. If it is still ok at the end of the function go ahead with the function callbackTemplateDisplay($hookName, $params) and do the same. Add additional debug statements to narrow down the line of code that might cause the issue.

diegomejia071 commented 2 years ago

Hi @ronste,

Yesterday's day, I managed to rebuild the search index for all my journals. The plugin seems to be working fine now.

With the memory thing, I have to check as you mentioned and debug the code, I also have to check if I can allocate more memory to my server, because our installation is too big and we have many journals on our platform.

I only have one doubt, because when the search index is rebuilt, there are some that are not indexed at all, that is, 0 articles or very few. As in this case, that I mentioned you in the previous post. Why is this happening?

LucenePlugin: Reset index...done LucenePlugin: Indexing "Philology. Academic and cultural newsletter". 0 articles indexed done

diegomejia071 commented 2 years ago

Hi @ronste,

And something very strange happened, at 11:27 since Solr was disconnected from the server, but at that time there was no strange event for this to happen. What could it be?

[Thu Jan 20 11:27:38.789470 2022] [php7:error] [pid 32614] [client 52.3.217.254:31670] PHP Fatal error: Failed to connect to 127.0.0.1 port 8983: Connection refused in /lib/pkp/classes/webservice/WebService.inc.php on line 228

Today he came back and the service of the sun woke up. I don't know what could be happening, or why this service is down.

[Fri Jan 21 00:09:33.622932 2022] [php7:error] [pid 17032] [client 185.191.171.18:4768] PHP Fatal error: Failed to connect to 127.0.0.1 port 8983: Connection refused in /lib/pkp/classes/webservice/WebService.inc.php on line 228

ronste commented 2 years ago

Hi @diegomejia071 ,

there could be different reasons for that. Without knowledge of your system and journals I can't really say anything.

ronste commented 2 years ago

Hi @ronste,

And something very strange happened, at 11:27 since Solr was disconnected from the server, but at that time there was no strange event for this to happen. What could it be?

[Thu Jan 20 11:27:38.789470 2022] [php7:error] [pid 32614] [client 52.3.217.254:31670] PHP Fatal error: Failed to connect to 127.0.0.1 port 8983: Connection refused in /lib/pkp/classes/webservice/WebService.inc.php on line 228

Today he came back and the service of the sun woke up. I don't know what could be happening, or why this service is down.

[Fri Jan 21 00:09:33.622932 2022] [php7:error] [pid 17032] [client 185.191.171.18:4768] PHP Fatal error: Failed to connect to 127.0.0.1 port 8983: Connection refused in /lib/pkp/classes/webservice/WebService.inc.php on line 228

Hi @diegomejia071 ,

again I can't really say much. But the error message clearly says "connection refused" and not "service not available". Clearly some service answered to the request but the connection was not allowed. So its not about a service being down.

Best wishes, Ronald

diegomejia071 commented 2 years ago

Hi @ronste,

I understand what it says in the message, what happens is that when this happens, the service no longer reconnects, and when I search one of my magazines, it appears blank. To get it to work again, I have to go and start Solr again.

In the image you can see that the service is already showing this message:

image

ronste commented 2 years ago

Hi @diegomejia071 ,

did you look at your apache access log? Is the request logged there or is it rejected before? If it is logged, any hints why the connection is refused?

diegomejia071 commented 2 years ago

Hi @ronste,

I have another problem.

After configuring the Lucene plugin in my OJS 3.2.1.4, which works perfectly when performing searches. I see that I receive different emails from some journals, with errors in the searches; which I don’t know why they happen.

The error message generated by the web service (if any): "Unfortunately the OJS search service is currently offline." This email was generated by Open Journal Systems' Lucene search plugin.

ronste commented 2 years ago

Hi @diegomejia071 ,

this message is sent whenever the lucene plugin was not able to talk to the solr server. There could be a number of reasons, including performance issues on the server, for this to happen.

In most cases this is a temporary issue. In case it persists you should monitor your server load and/or connections.

By the way, this plugin is now maintained by PKP and moved to https://github.com/pkp/lucene