Himalayan-Academy / Hinduism-Today

Tracking Hinduism Today

Hinduism Today / HPI Migration Job #15

Open soapdog opened 3 years ago

soapdog commented 3 years ago

I'm using this GitHub issue to document the migration job for the Hinduism Today and HPI data into the new WordPress-based site.

This is mostly so I'm able to document the process and all that was done.

cc @JaiNatha

soapdog commented 3 years ago

@JaiNatha migrated the old Hinduism Today website to dev.hinduismtoday.com but mentioned to me that a Virtualmin setting was missing, which was preventing the site from launching.

I just spent a couple of hours trying to understand what was happening and why the hindupressinternational.com website was coming up when we accessed the dev.hinduismtoday.com domain.

soapdog commented 3 years ago

I noticed that dev.hinduismtoday.com has no Apache Virtual Host associated with it, even though the Virtualmin option for enabling Apache is set in its web interface.

[Screenshot: Apache Webserver, Webmin 1.973 (Ubuntu Linux 14.04.6), 2021-05-10 13:21:29]

_This screen shows that there is no Virtual Host matching /home/htoday/public_html_

I went ahead and also looked on the hard drive of the server in /etc/apache2/sites-available, and could not find a suitable configuration file for dev.hinduismtoday.com.

I'm attempting to create one now.

soapdog commented 3 years ago

I created the virtual host file and the server now launches, but it can't execute any PHP scripts. They all fail with Error 500. This is what is being logged:

[Mon May 10 06:31:06.627487 2021] [fcgid:warn] [pid 5060] (104)Connection reset by peer: [client 85.255.235.68:38330] mod_fcgid: error reading data from FastCGI server
[Mon May 10 06:31:06.627586 2021] [core:error] [pid 5060] [client 85.255.235.68:38330] End of script output before headers: test.php

The reason I need PHP running on the old server is so that I can run either phpMyAdmin or Adminer and work with the export files. That can't be done using Virtualmin alone.

soapdog commented 3 years ago

Something strange is happening with the PHP installation on that server.

root@hinduismtoday:/home/htoday# php-cgi -v
PHP Warning:  PHP Startup: Unable to load dynamic library 'gd2' (tried: /usr/lib/php/20170718/gd2 (/usr/lib/php/20170718/gd2: cannot open shared object file: No such file or directory), /usr/lib/php/20170718/gd2.so (/usr/lib/php/20170718/gd2.so: cannot open shared object file: No such file or directory)) in Unknown on line 0
PHP Warning:  PHP Startup: Unable to load dynamic library 'gmp' (tried: /usr/lib/php/20170718/gmp (/usr/lib/php/20170718/gmp: cannot open shared object file: No such file or directory), /usr/lib/php/20170718/gmp.so (/usr/lib/php/20170718/gmp.so: cannot open shared object file: No such file or directory)) in Unknown on line 0
PHP Warning:  PHP Startup: Unable to load dynamic library 'intl' (tried: /usr/lib/php/20170718/intl (/usr/lib/php/20170718/intl: cannot open shared object file: No such file or directory), /usr/lib/php/20170718/intl.so (/usr/lib/php/20170718/intl.so: cannot open shared object file: No such file or directory)) in Unknown on line 0
PHP Warning:  PHP Startup: Unable to load dynamic library 'mysqli' (tried: /usr/lib/php/20170718/mysqli (/usr/lib/php/20170718/mysqli: cannot open shared object file: No such file or directory), /usr/lib/php/20170718/mysqli.so (/usr/lib/php/20170718/mysqli.so: undefined symbol: mysqlnd_global_stats)) in Unknown on line 0
PHP Warning:  PHP Startup: Unable to load dynamic library 'openssl' (tried: /usr/lib/php/20170718/openssl (/usr/lib/php/20170718/openssl: cannot open shared object file: No such file or directory), /usr/lib/php/20170718/openssl.so (/usr/lib/php/20170718/openssl.so: cannot open shared object file: No such file or directory)) in Unknown on line 0
PHP Warning:  Module 'gettext' already loaded in Unknown on line 0
PHP 7.2.17-1+ubuntu14.04.1+deb.sury.org+3 (cgi-fcgi) (built: Apr 10 2019 11:13:56)
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.2.0, Copyright (c) 1998-2018 Zend Technologies
    with Zend OPcache v7.2.17-1+ubuntu14.04.1+deb.sury.org+3, Copyright (c) 1999-2018, by Zend Technologies
    with Xdebug v2.7.1, Copyright (c) 2002-2019, by Derick Rethans
root@hinduismtoday:/home/htoday#

soapdog commented 3 years ago

Still trying to make it work...

soapdog commented 3 years ago

Found it.

The problem was that the suexec group was set incorrectly in the virtual host configuration. It was set to htadmin and not htoday. This caused suexec to kill the PHP process. The error in error_log could have been more explicit that it was a permission error. This just took me six hours to debug.
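For future reference, a minimal sketch of how this kind of mismatch could be spotted without reading the config by eye. It assumes the virtual host file uses the `SuexecUserGroup` directive from mod_suexec; the sample fragment and the function name are made up for illustration and are not the real file from the server.

```python
import re

# SuexecUserGroup is the mod_suexec directive that names the user and group
# CGI/FastCGI programs run as for a virtual host.
DIRECTIVE = re.compile(
    r'^\s*SuexecUserGroup\s+"?([^"\s]+)"?\s+"?([^"\s]+)"?\s*$',
    re.IGNORECASE | re.MULTILINE,
)

def suexec_mismatches(vhost_text, expected_user, expected_group):
    """Return the (user, group) pairs that differ from the expected owner."""
    found = DIRECTIVE.findall(vhost_text)
    return [(u, g) for u, g in found if (u, g) != (expected_user, expected_group)]

# Hypothetical vhost fragment, for illustration only.
SAMPLE = """
<VirtualHost *:80>
    ServerName dev.hinduismtoday.com
    DocumentRoot /home/htoday/public_html
    SuexecUserGroup htoday htadmin
</VirtualHost>
"""

print(suexec_mismatches(SAMPLE, "htoday", "htoday"))
# prints [('htoday', 'htadmin')]
```

An empty list means every `SuexecUserGroup` line agrees with the owner of the document root.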

soapdog commented 3 years ago

Finding localized items to migrate

These are the smartsection categories that still need migration: 401, 402, 403, 404, 405, 406, 409, 415, 439, 448, 450.

I will need to create a new export script to pick them out.
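A rough sketch of the selection step such a script needs, using an in-memory SQLite database so it runs anywhere. The table and column names (`items`, `itemid`, `categoryid`, `title`) are assumptions, not the real smartsection schema, and the real database is MySQL, where the driver's placeholder syntax may differ from SQLite's `?`.

```python
import sqlite3

# Category ids still pending migration, as listed above.
CATEGORY_IDS = [401, 402, 403, 404, 405, 406, 409, 415, 439, 448, 450]

def fetch_items(conn, category_ids):
    """Select the items whose category is in category_ids, ordered by id."""
    # One placeholder per id keeps the query parameterized.
    placeholders = ", ".join("?" for _ in category_ids)
    sql = (
        "SELECT itemid, categoryid, title FROM items "
        f"WHERE categoryid IN ({placeholders}) ORDER BY itemid"
    )
    return conn.execute(sql, category_ids).fetchall()

# Hypothetical stand-in data; the real rows live in the XOOPS database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (itemid INTEGER PRIMARY KEY, categoryid INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, 401, "first"), (2, 100, "already migrated"), (3, 450, "third")],
)

rows = fetch_items(conn, CATEGORY_IDS)
print(rows)
```

With the stand-in rows, only the items in categories 401 and 450 come back.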

soapdog commented 3 years ago

Got the data for all the localised articles. There are 173 localised Publisher's Desk articles. They have been saved to public_html/andre/htoday-lang-export/, which is a temporary working folder on the server.

soapdog commented 3 years ago

Been working on the tool to import these localised Publisher's Desk articles into WordPress. It is a bit trickier than the other articles because the data is not always saved in the database as Unicode.

image

The image above is a screenshot of a Gujarati version of the article. The title of the item is in Unicode but the content in the body and summary is using HTML character entities instead of Unicode characters. I kinda need to take that into account when processing the data.
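A small sketch of one way to handle that, assuming the body and summary use numeric character references such as `&#2711;`. It decodes only references above the ASCII range, so markup-significant ones like `&amp;` or `&#60;` stay escaped and the HTML is not altered. The function name and the sample strings are illustrative, not taken from the real data.

```python
import re

# Matches decimal (&#2711;) and hexadecimal (&#x0A97;) character references.
NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")

def decode_non_ascii_entities(text):
    """Replace numeric references to non-ASCII code points with the character."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave ASCII, surrogates and out-of-range values untouched.
        if code_point < 128 or code_point > 0x10FFFF:
            return match.group(0)
        if 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        return chr(code_point)
    return NUMERIC_ENTITY.sub(replace, text)

# Illustrative body: three Gujarati letters written as decimal references.
body = "<p>&#2711;&#2753;&#2716; &amp; &#60;</p>"
print(decode_non_ascii_entities(body))
```

Text that is already Unicode, like the titles, passes through unchanged, so the same function can be applied to every field.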

soapdog commented 3 years ago

I think I can find the links to the other localised versions in the body of a given article. The problem is then extrapolating from that XOOPS itemId to the WordPress link it will end up with. This is a bit tricky.
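One possible shape for this, as a sketch: pull the itemIds out of the body first, then resolve them against a lookup table filled in as articles are imported, keeping a list of the ones that can't be resolved yet for a second pass. The `item.php?itemid=N` link shape is an assumption about how the bodies link to each other, and the lookup table, URLs and function names are hypothetical.

```python
import re

# Assumed link shape for smartsection items; adjust to what the bodies contain.
ITEM_LINK = re.compile(r"item\.php\?itemid=(\d+)", re.IGNORECASE)

def linked_item_ids(body_html):
    """Return the itemIds linked from an article body, in first-seen order."""
    ids = []
    for match in ITEM_LINK.finditer(body_html):
        item_id = int(match.group(1))
        if item_id not in ids:
            ids.append(item_id)
    return ids

def resolve_links(body_html, wp_links):
    """Split linked itemIds into those with a known WordPress URL and the rest."""
    resolved, pending = {}, []
    for item_id in linked_item_ids(body_html):
        if item_id in wp_links:
            resolved[item_id] = wp_links[item_id]
        else:
            pending.append(item_id)
    return resolved, pending

# Hypothetical data for illustration only.
body = (
    '<a href="/modules/smartsection/item.php?itemid=5001">Gujarati</a> '
    '<a href="/modules/smartsection/item.php?itemid=5002">Hindi</a>'
)
wp_links = {5001: "https://example.org/gujarati-article/"}

print(resolve_links(body, wp_links))
```

The pending list is the tricky part mentioned above: those links can only be rewritten once the target article has been imported and has its own WordPress URL.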