mrAceT / nextcloud-S3-local-S3-migration

Script for migrating Nextcloud primary storage from S3 to local to S3 storage
GNU General Public License v3.0

Script aborting #11

Closed tomcatcw1980 closed 6 months ago

tomcatcw1980 commented 6 months ago

Hi There,

I don't want to create a duplicate, but I'm not sure whether my issue is better placed here:

https://github.com/nextcloud/server/issues/34407#issuecomment-1938179733

Greetings Christian

tomcatcw1980 commented 6 months ago

Hi mrAceT,

I modified some settings in my.cnf so that the script ran to the end. But unfortunately I don't see any shared files. In the files overview I see the shared folders with their size, but when I enter the folders, they are empty. Do you have any ideas?

occ files:scan + files:scan-app-data did not do the trick.

[Screenshots: "Dateien - Team-Cloud" file views, 2024-02-12]

Thank you for your help.

Greetings Christian

mrAceT commented 6 months ago

Hi @tomcatcw1980,

First of all, it is rarely wise to add comments to a closed issue.

Second, in response:

first of all, thank you very much for your script.

You are welcome (buy me a cup of coffee ;) )

Who is the clouduser? It is probably not www-data, because with 'sudo www-data' I get the message that www-data is (logically) not a sudoer. That's why I just left it at root.

The 'cloud user' is the user of your Nextcloud installation. When you perform actions as root, all file actions are performed as root, and it is then very likely that your Nextcloud installation does not have the rights to access the migrated files. If you are unable to run the script as your Nextcloud installation user, you will need to manually set the owner of the files and folders to your Nextcloud user!

What is meant by this variable? $PATH_BASE = ''; // Path to the base of the main Nextcloud directory. What has to go in there? So far I have left it empty, because otherwise the path is not correct and the script will not start.

Looking at the next line, the 'base path' in your case will be something like '/var/www/nextcloud'.

What is the difference to this variable? $PATH_NEXTCLOUD = $PATH_BASE.'/var/www/nextcloud'; // Path of the public Nextcloud directory

I am guessing something like '/public' or '/public_html' (the web part of your Nextcloud installation).
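
Putting those two answers together, the intended configuration would look roughly like this (the paths below are examples, not necessarily your actual values):

// example only: adjust these to your own installation
$PATH_BASE      = '/var/www/nextcloud';   // base of the main Nextcloud directory
$PATH_NEXTCLOUD = $PATH_BASE.'/public';   // the public (web) part, could also be '/public_html'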

Is it intended in your script that it retains the shares etc.?

Good question! I expect that the Nextcloud structure will retain the shares, because my script only redirects "the pointer" to the file from S3 to local (or the other way around).

PS: I just got this error message. It copied all data, then asked to continue: [...] /nc_data/appdata_ocoyq3be73jp/css/terms_of_service/6aaf-32d3-overlay.css.deps /nc_data/appdata_ocoyq3be73jp/css/theming/d71e-32d3-settings-admin.css.gzip

Continue?Y

I expect the few lines above your quote will be more revealing to me..

Copying files finished
#########################################################################################
Modifying database started...
PHP Fatal error: Uncaught mysqli_sql_exception: MySQL server has gone away in /var/www/nextcloud-s3-to-disk-migration/s3tolocal.php:367
Stack trace:
#0 /var/www/nextcloud-s3-to-disk-migration/s3tolocal.php(367): mysqli->query()
#1 {main}
thrown in /var/www/nextcloud-s3-to-disk-migration/s3tolocal.php on line 367

Yikes, does this happen every time? This means the connection to your MySQL server has been lost. The part above line 367 regularly uses the $mysqli connection.. so I am thinking database corruption? Place Nextcloud in maintenance mode and, to be sure, perform some database repair actions on the table named in the error on your line 367. If that doesn't do the trick, try a reconnection: take the part "connect to sql-database..." (somewhere around line 80), copy those lines directly below "Copying files finished", and see if that does the trick.
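
For illustration, a minimal sketch of that reconnection idea, assuming the script keeps its credentials in variables such as $SQL_HOST, $SQL_USER, $SQL_PASS and $SQL_DATABASE (those names are guesses; take the actual values from the "connect to sql-database..." block around line 80). Raising wait_timeout or max_allowed_packet in my.cnf attacks the same "server has gone away" symptom from the server side.

// hypothetical sketch, to be placed right after "Copying files finished";
// the variable names below are assumptions, not the script's actual ones
if ($mysqli instanceof mysqli) {
    $mysqli->close(); // the old handle may have timed out during the long copy phase
}
$mysqli = new mysqli($SQL_HOST, $SQL_USER, $SQL_PASS, $SQL_DATABASE);
if ($mysqli->connect_error) {
    die("\nReconnect to database failed: ".$mysqli->connect_error);
}
$mysqli->set_charset('utf8mb4'); // keep the charset the rest of the script expects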

tomcatcw1980 commented 6 months ago

You are welcome (buy me a cup of coffee ;) )

I really will. Tell me how.

But first, it would be nice if you had any idea where the shares have gone.

By modifying my.cnf the script ran to the end. But when I log into the NC instance with the migrated data, the shares don't work anymore (see the screenshots above). I have more than 200 users, so I can't have every user restore their shares manually.

I ran occ sharing:delete-orphan-shares; after that the shares are not shown at all anymore, because they are all orphaned.

I would be very grateful for help

Greetings

mrAceT commented 6 months ago

If still possible I'd restore the backup first (reconnect to your S3, use the backup.sql).

tomcatcw1980 commented 6 months ago

I can undo everything. I set up a second instance and can try as often as I need.

What do you mean exactly: Shall I restore the backup.sql before running the script again?

mrAceT commented 6 months ago

Ah, you read my manual.. wow, people who read the manual and act on it really exist! ;)

So my theory that the shares would remain intact doesn't hold. Darn..

That means I will need to (re)build my test setup and find out how the shares change..

I need to dive into the share structure and find out what needs to be migrated :-/

About a year ago I did some digging into shares (for my project https://github.com/GeoArchive) and need to brush up on it for my automation of creating accounts on that platform (which uses Nextcloud for the data).. but I haven't had the time for that yet.. how much of a hurry are you in?

mrAceT commented 6 months ago

AD: If you can "correct" one share and tell me EXACTLY what you've changed, I'll try to build the migration code on that. Idea?

tomcatcw1980 commented 6 months ago

Ah, you read my manual.. wow, people who read the manual and act on it really exist! ;)

Yes ;-) for my own safety. This instance is in production. If I destroyed it, I would be dead ;-)

About a year ago I did some digging into shares (for my project https://github.com/GeoArchive) and need to brush up on it for my automation of creating accounts on that platform (which uses Nextcloud for the data).. but I haven't had the time for that yet.. how much of a hurry are you in?

I don't want to push you. I'm really grateful that you are taking care of this problem.

tomcatcw1980 commented 6 months ago

AD: If you can "correct" one share and tell me EXACTLY what you've changed, I'll try to build the migration code on that. Idea?

You mean I should re-share an old share that doesn't work anymore? I don't think this is a good idea, because I'm in a staging environment. I guess the user would then be informed by mail that a new share was made on another instance. Can we avoid that?

mrAceT commented 6 months ago

If I understood you correctly you have done a test migration in a copied instance.

I need to (re)build my test setup to find out what needs to be changed. I was hoping you could tell me what needs to be changed (in the MySql database table(s) ). Based upon that I could extend the migration script.

But I will admit that'll require a bit of digging (trust me, I know ;) ) so I won't blame you if you leave that to me ;)
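
For what it's worth, one way to spot shares that have become orphaned (their file_source no longer pointing at an existing row in oc_filecache, which is what sharing:delete-orphan-shares removes) could be a quick query along the lines of the sketch below; the 'oc_' table prefix and the connection values are assumptions, not taken from your setup.

// hypothetical diagnostic, not part of the migration script
$mysqli = new mysqli('localhost', 'nextcloud', 'secret', 'nextcloud'); // placeholder credentials
$result = $mysqli->query(
    "SELECT s.id, s.share_type, s.file_target
       FROM oc_share s
       LEFT JOIN oc_filecache f ON f.fileid = s.file_source
      WHERE f.fileid IS NULL"
);
while ($row = $result->fetch_assoc()) {
    // every row here is a share pointing at a file id that no longer exists
    echo "orphaned share #{$row['id']} -> {$row['file_target']}\n";
}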

tomcatcw1980 commented 6 months ago

I have done the following:

1) I created an exact copy of the instance to be migrated.
-- on the test machine: --
2) Then I adjusted the variables in your script.
3) Then I set it to live ("1") and let the script run. The fact that the script then terminated was probably due to incorrect settings in my.cnf; I was able to fix this so that the script ran through.
4) Since I ran everything as root, I had to manually chown the folders to www-data and then run files:scan --all and files:scan-app-data again.

I noticed the following, but I have no idea if it matters: the data that is downloaded from the S3 bucket apparently does not contain a file named .ocdata. I had to copy this over from the old location to the new NC data folder as well. I was then able to log into the staging instance and can see all the data migrated from S3. But unfortunately the shares no longer work, although they are apparently still displayed.

I can delete everything on this instance again, restore the original DB of the old instance and turn everything back to the beginning. Since the S3 data is only copied, nothing is broken.

But I'm afraid there is not much more I can contribute here. I would therefore be grateful if you could somehow get to the bottom of the problem.

tomcatcw1980 commented 6 months ago

Hi, I did a bit of research again and made an extract of the oc_share table. The table has the following columns; in the production system the table is filled with data:


id | share_type | share_with | password | uid_owner | uid_initiator | parent | item_type | item_source | item_target | file_source | file_target | permissions | stime | accepted | expiration | token | mail_send | share_name | password_by_talk | note | hide_download | label | password_expiration_time | attributes
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --

Remember: after running the script, shares were still displayed for the user, but the folder was empty when I entered it. I then ran occ sharing:delete-orphan-shares. After that, the oc_share table was empty and the folders were no longer displayed as shared for the user.

Now I have imported the backup.sql again. The oc_share table is filled again, but the shares are not displayed for the user. Somehow we are overlooking something.

tomcatcw1980 commented 6 months ago

This wouldn't let me rest, so I have now performed the migration again. Immediately after the successful migration - the script ran cleanly to the end - I looked into the oc_share table: the data is there.

I have just noticed one more thing: when I open the details of a folder that is displayed as shared (by the owner) and switch to the Sharing tab, an error message appears:

"Selected keywords could not be loaded"

With the occ sharing:delete-orphan-shares I got this result:

[screenshot]

I hope you find a solution somehow.

Thank you very much.

mrAceT commented 6 months ago

Since I ran everything as root, I had to manually chown the folders to www-data and then run files:scan --all and files:scan-app-data again.

That might be the root cause of your problem. The script already does a "files:scan --all" (line 397).. but since you needed to do it again, I think the scan at line 397 actually breaks the migration, because at that point the data isn't accessible by Nextcloud..

This because you are unable to run as the user "www-data" (which is an assumption on the creation of the migration script).

Try this: in the migration script add this after line 358:

echo "\nCopying files finished";

echo "\nSet the correct owner of the data folder..";
echo occ('','chown -R www-data:www-data '.$PATH_DATA);

(assuming the user and group of the data folder are 'www-data')

I 'abuse' the function 'occ' I created to perform this owner swap. Then the owner is correct BEFORE we pull out of maintenance mode.. that might just do the trick.. I think/hope (this idea is untested)

Please try and let me know

[update] changed the location of the added lines (this spot is better)

mrAceT commented 6 months ago

@tomcatcw1980

I have updated the S3toLocal script to version 0.32

Someone else pointed out another gap in the migration script (in a part that I copied over from localtoS3..)

I also added the option that I assume will help you with the migration..

Could you try?

tomcatcw1980 commented 6 months ago

@tomcatcw1980

I have updated the S3toLocal script to version 0.32

Someone else pointed out another gap in the migration script (in a part that I copied over from localtoS3..)

I also added the option that I assume will help you with the migration..

Could you try?

Thanks very much. I will try. Is there a way not to copy all the data from S3 each time? I have 140 GB that gets copied every time.

PS: I hope you got the little motivation via PayPal.

tomcatcw1980 commented 6 months ago

I started one more time. Now I get a lot of error messages:

[screenshot]

I changed the following settings:

[screenshot]

I will inform you tomorrow.

Thanx.

mrAceT commented 6 months ago

I saw it, thanks! (added the option via Paypal now :)

Is there way not to copy the whole data from S3 each time?

RTM ;)

check line 25: $PATH_DATA_BKP = $PATH_BASE.'/data.bkp'; // Path of a previous migration.. to speed things up.. (manually move a previous migration here!!)

mrAceT commented 6 months ago

Yikes, that qualifies as an OOPS

Fixed the code, could you retry? (I am unable to test atm, my instance is (happily) running via S3)

I have taken a look at your image: /nc_data (a folder directly under root) is the location of your Nextcloud files.

/var/nc_data is the location where the script will look for data from a previous migration. If the data (size/date) is the same, it will use that file: it will not download the file from S3 again, but will simply copy it from /var/nc_data to /nc_data.
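
Just to illustrate the idea (this is not the script's actual code; the function and variable names are made up), the reuse check boils down to something like this:

// illustrative only: reuse a file from a previous migration when size and
// modification time still match, otherwise fall back to downloading from S3
function fetchFile(string $key, int $s3Size, int $s3Mtime, string $pathData, string $pathDataBkp): void
{
    $target = $pathData.'/'.$key;
    $backup = $pathDataBkp !== '' ? $pathDataBkp.'/'.$key : '';
    @mkdir(dirname($target), 0770, true);

    if ($backup !== '' && is_file($backup)
        && filesize($backup) === $s3Size
        && filemtime($backup) >= $s3Mtime) {
        copy($backup, $target); // reuse: no download needed
        return;
    }

    // otherwise download from S3, e.g. with the AWS SDK:
    // $s3->getObject(['Bucket' => $bucket, 'Key' => $key, 'SaveAs' => $target]);
}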

PS: you are able to run as 'sudo -u www-data'? Then the extra '$CLOUDUSER' value should not be needed..

tomcatcw1980 commented 6 months ago

Got this error now

[screenshot]

My MariaDB version: 10.11.6-MariaDB-0+deb12u1-log (Debian 12)

mrAceT commented 6 months ago

It must be the time of day.. it was a typo => updated

tomcatcw1980 commented 6 months ago

Now I get this, but I don't understand it. The path of the new Nextcloud data directory is set to:

$PATH_DATA = $PATH_BASE.'/nc_data';

There I copied the .ocdata file from the old instance, but it doesn't work. What am I doing wrong?

[screenshot]

I'm going to call it a day for now. I won't be able to get back to it until tomorrow evening. Thanks for your help so far.

Greetings, and have a nice evening.

mrAceT commented 6 months ago

This looks like a 'clouduser' thing.. it looks like you changed something that doesn't allow you to set maintenance mode to 'on'

(that sudo -u part?)

Do one or the other: run as root and do a chown, or run as the clouduser (and do not chown). By the looks of a previous image, you have a dedicated server running for Nextcloud and the data sits in a folder directly under root. I would suggest using '/var/nextcloud_data' as the data folder and '/var/nextcloud_data.bkp' as the backup/previous-migration folder (I don't like to put stuff directly in root...).

PS: when you have $NON_EMPTY_TARGET_OK set to 1, you may end up with some files in your final setup that a user has already deleted. That is why I created '$PATH_DATA_BKP': the script then looks in that folder and you get a clean migration. Check step 5 in the readme (does that make sense?).

tomcatcw1980 commented 6 months ago

Hi, two questions:

a) Shall I add a user named clouduser and give it sudo rights, so that I can use your script without modifying anything?

b) A basic question that comes to mind: my instance runs with S3 configured as primary storage. In my config.php there is a 'datadirectory' entry that points to a local folder /var/nc_data. Does this setting exist regardless of whether the instance hosts its data locally or uses S3 as primary storage? Or is it a relic in my config.php?

I ask because in your script there are only variables for the new data directory and for a directory with existing data from previous migrations from S3. Doesn't there have to be a variable for the data directory of the current installation?

mrAceT commented 6 months ago

Ah, then you need to use the existing one! /var/nc_data

I expect the 'chown' option I added will do the trick.

In your case I'd try:
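
Perhaps something along these lines; the values below are examples and assumptions, adjust them to your setup:

// example values only; adjust to your own setup
$PATH_DATA     = '/var/nc_data';      // the existing data directory from your config.php
$PATH_DATA_BKP = '/var/nc_data.bkp';  // data from a previous run, so the 140 GB is not downloaded again
$CLOUDUSER     = 'www-data';          // assumption: the chown option added in 0.32, for when you cannot run as www-data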

tomcatcw1980 commented 6 months ago

Hi,

I think it worked now. Thank you so much!

I will now do some tests and run it again, so that I have a safe way to migrate. In another thread - on help.nextcloud.com I think, I can't find it anymore - I saw the hint that the Circles app may cause some broken links. So before migrating I disabled it, just to be safe.

I really appreciate your help.

Greetings Christian

mrAceT commented 6 months ago

Glad to have helped!

To speed up migration in your live setup, use the $PATH_DATA_BKP option!

tomcatcw1980 commented 6 months ago

One thing I forgot that I did: I had trouble logging in, it told me something about an invalid token. occ maintenance:data-fingerprint helped then.

mrAceT commented 6 months ago

That is most likely because you used a copy of your live instance.. each instance must have its own 'fingerprint'.