owncloud / core

:cloud: ownCloud web server core (Files, DAV, etc.)
https://owncloud.com
GNU Affero General Public License v3.0
8.34k stars 2.05k forks source link

OC7: Problems converting from SQLite3 to PostgresQL #9891

Closed jmaerki closed 8 years ago

jmaerki commented 10 years ago

(Move here from http://forum.owncloud.org/viewtopic.php?f=29&t=22446)

I've just upgraded to OwnCloud 7 and the upgrade went well. It's currently running on SQLite and I would like to migrate to PostgresQL since I've run into concurrency problems due to the usage of multiple devices.

Steps to reproduce

So, after finding https://github.com/owncloud/core/issues/9526 I adjusted the call to the following: sudo php occ db:convert-type --password="wonttell" --all-apps pgsql owncloud localhost owncloud

Expected behaviour

Well, just a regular migration from SQLite to PostgresQL.

Actual behaviour

The first issue was:

    oc_filecache
      8810/20428 [============>---------------]  43%

     [Doctrine\DBAL\DBALException]
      An exception occurred while executing 'INSERT INTO oc_filecache ("fileid", "storage", "parent", "name", "mimetype", "mimepart", "mtime", "encrypted", "path
      _hash", "etag", "storage_mtime", "path", "size", "unencrypted_size", "permissions") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)' with params ["209
      08", "7", "20905", "\x31\x35 [..] \x6d\x70\x33", "11", "10", "1295508789", "0", "[..]", "[..]", "1295508789", "\x4d
      \x75\x73\x69\ [..] \xb4\x74 [..]", "3903543", "0", "0"]:

      SQLSTATE[22021]: Character not in repertoire: 7 ERROR:  invalid byte sequence for encoding "UTF8": 0xb4

      [PDOException]
      SQLSTATE[22021]: Character not in repertoire: 7 ERROR:  invalid byte sequence for encoding "UTF8": 0xb4

I've replaced some content with [..] above to obfuscate the actual data. Anyway, the Postgres database is set up as described in the manual. It says "UTF-8" as encoding, but I suspect this is coming from the reading side somehow.

I read somewhere that it is possible to reset the file cache, so I ran "sqlite3 owncloud.db" and "DELETE FROM oc_filecache;" and that actually brought me past this first issue.

But then I ran into the following:

    oc_contacts_cards_properties
      365/1516 [======>---------------------]  24%

      [Doctrine\DBAL\DBALException]
      An exception occurred while executing 'INSERT INTO oc_contacts_cards_properties ("id", "name", "value", "preferred", "userid", "contactid") VALUES (?, ?, ?, ?, ?, ?)' with params ["486", "PHOTO", "\/9j\/4AAQS [..] 8A\/9k=", "0", "[..]", "73"]:

      SQLSTATE[22001]: String data, right truncated: 7 ERROR:  value too long for type character varying(255)

      [PDOException]
      SQLSTATE[22001]: String data, right truncated: 7 ERROR:  value too long for type character varying(255)

"value" is quite big, surely more than 255 characters. Given that the parameter is called "PHOTO" I assume a base64-encoded JPEG image associated with a contact. But on the PostgresQL table created the column "value" is just "value character varying(255) DEFAULT NULL::character varying" which explains the problem. Should this maybe be a "text" or blob?

Server configuration

Operating system: Linux 3.10.25+ armv6l (RaspberryPi)

Web server: Apache HTTPD 2.2.22

Database: SQLite3 3.7.13, PostgresQL 9.1.13

PHP version: 5.4.4-14+deb7u12

ownCloud version: 7.0.0

Updated from an older ownCloud or fresh install: Updated from latest OC 6.x

List of activated apps: Standard apps + Mozilla Sync

The content of config/config.php:

<?php
$CONFIG = array (
  'instanceid' => '51bc88cb8abfa',
  'passwordsalt' => '*** Removed for security reasons',
  'datadirectory' => '/var/www/owncloud/data',
  'dbtype' => 'sqlite3',
  'version' => '7.0.0.8',
  'installed' => true,
  'loglevel' => '1',
  'forcessl' => true,
  'mail_smtpmode' => 'smtp',
  'mail_smtpdebug' => true,
  'mail_smtphost' => 'mail.wonttell',
  'mail_smtpauth' => true,
  'mail_smtpsecure' => 'tls',
  'mail_smtpname' => 'wonttell',
  'mail_smtppassword' => '*** Removed for security reasons',
  'theme' => '',
  'maintenance' => false,
  'appstoreenabled' => true,
  'appstoreurl' => 'http://api.apps.owncloud.com/v1',
  'trusted_domains' =>
  array (
    0 => 'wonttell.domain',
  ),
);

Are you using external storage, if yes which one: local (external USB drive)

Are you using encryption: yes

Client configuration

n/a here

Logs

Web server error log

n/a here

ownCloud log (data/owncloud.log)

I've got quite a few of these in the logs even before the upgrade to OC7 which could be relevant:

{"app":"core","message":null,"level":2,"time":"2014-07-24T08:29:30+00:00"}
{"app":"PHP","message":"json_encode(): Invalid UTF-8 sequence in argument at \/mnt\/usbdrive\/www\/owncloud\/lib\/private\/log\/owncloud.php#86","level":3,"time":"2014-07-24T08:29:30+00:00"}

Otherwise, I don't see anything in the logs that could be applicable to this problem.

Browser log

n/a here

PVince81 commented 10 years ago

@bantu

bantu commented 10 years ago

@PVince81 Possible inconsistency in schemas across DBMSes. @DeepDiver1975 might be able to help.

DavidS commented 10 years ago

I've hit that problem too. Seems like sqlite is somehow (perhaps due to migrations?) not checking the length of varchar fields:

sqlite> .schema oc_contacts_cards_properties
CREATE TABLE oc_contacts_cards_properties ("id" INTEGER NOT NULL, name VARCHAR(64) DEFAULT NULL, value VARCHAR(255) DEFAULT NULL, preferred INTEGER DEFAULT 1 NOT NULL, userid VARCHAR(255) DEFAULT '' NOT NULL, "contactid" INTEGER DEFAULT 0 NOT NULL, PRIMARY KEY("id"));
CREATE INDEX cp_contactid_index ON oc_contacts_cards_properties ("contactid");
CREATE INDEX cp_name_index ON oc_contacts_cards_properties (name);
CREATE INDEX cp_value_index ON oc_contacts_cards_properties (value);
sqlite> select max(length(value)) from oc_contacts_cards_properties;
146704
sqlite> 

I've fixed this by changing the database.xml for oc_contacts_cards_properties.value to clob.

oc_bookmarks' title was also hit by this. I "fixed" it likewise.

jpschewe commented 10 years ago

I've got the same problem too with the bookmarks.

  [Doctrine\DBAL\DBALException]                                                
  An exception occurred while executing 'INSERT INTO oc_bookmarks ("id", "pub  
  lic", "url", "title", "user_id", "description", "added", "lastmodified", "c  
  lickcount") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)' with params ["84", "0", "ht  
  tp:\/\/code.google.com\/p\/jmockit\/wiki\/MockingToolkitComparisonMatrix",   
  "MockingToolkitComparisonMatrix - jmockit - A feature matrix comparing seve  
  ral mocking toolkits. - A capable and elegant developer testing toolkit for  
   Java - Google Project Hosting", "jpschewe", "", "1376360150", "1376360150"  
  , "0"]:                                                                      

  SQLSTATE[22001]: String data, right truncated: 7 ERROR:  value too long for  
   type character varying(140)                                                 

I fixed this by editing the sqlite database before the conversion. I only had 1 bookmark that was a problem.

jasonhoekstra commented 10 years ago

Same for errors on oc_jobs and oc_activities, the fix was expand the character fields from 255 to 1024 in either db_structure.xml or appinfo/database.xml depending on the table.

Looks like the desired field lengths are overflowing the definition in the database.xml files, which SQLite doesn't enforce a length limit on varchar fields. As per SQLite documentation, "Note that numeric arguments in parentheses that following the type name (ex: "VARCHAR(255)") are ignored by SQLite - SQLite does not impose any length restrictions (other than the large global SQLITE_MAX_LENGTH limit) on the length of strings" (http://www.sqlite.org/datatype3.html)

RealCrazyBird commented 10 years ago

I have the same issue, but I had PostgresQL before, just tried to upgrade from 6.0.x to 7.0.2...

Does anyone of the developers test migrations before the final release of a new owncloud- version using other DBMS dann sqllite/mysql?

Jef2i commented 9 years ago

Same issue. I've tried @DavidS' workaround I understood being a modification of the file /apps/contacts/appinfo/database.xml . Now I've got another error message

 [Doctrine\DBAL\DBALException]                                                                                                                              
  An exception occurred while executing 'INSERT INTO oc_contacts_cards_properties ("id", "name", "value", "preferred", "userid", "contactid") VALUES (?, ?,  
   ?, ?, ?, ?)' with params ["55", "PHOTO", "<TRUNCATED VALUE>", "0", "<USERNAME>"  
  , "12"]:                                                                                                                                                   

  SQLSTATE[54000]: Program limit exceeded: 7 ERROR:  index row size 6256 exceeds maximum 2712 for index "cp_value_index"                                     
  HINT:  Values larger than 1/3 of a buffer page cannot be indexed.                                                                                          
  Consider a function index of an MD5 hash of the value, or use full text indexing.

If someone has an understanding key to help me decrypting this error message it'll be appreciated. It's my first use of PostgreSQL

Coming back to the initial issue. Previously mentionned workaround applied, I noticed that PgSql side created table oc_contacts_cards_properties the field value is still of "TEXT" type. Is it a correct read of the "converttype.php" tool if I say that the migration uses the schema of the source's DB (sqlite3 here) to create the destination structure ? Is the database.xml apps' desctiption used during migration ? I could think so regarding my new error message. But I can't understand when.

All of these brings me some wonderings. First : Still being into workarounds, souldn't the best way to migrate be

Second : any side effect for future OC updates/upgrade ??

Jef2i commented 9 years ago

Well, didn't find how to get rid of my previous error messsage (index limitation). May be was it due to a pgsql process parameter I prefer not to tune. So I finally found a way to complete a migration.

Got another migration error with oc_bookmarks (deleted already created tables)

Migration process went till the end \o/

Finally \ manually created a partial index CREATE INDEX cp_value_index ON oc_contacts_cards_properties (value) WHERE length(value) <= 255;

Everything's running for now.

Your comments are welcome concerning the choice of TEXT vs CLOB (if it's the right place for it)

jmaerki commented 9 years ago

Thanks for all the feedback. That prompted me to try again and it actually helped me do the migration in the end even if there was quite a bit of manual editing of database.xml was required. Hopefully not with any downsides with future upgrades. But....I want to make everyone aware that I had one major downside to it all and I don't know if it's a problem of the migration or the database design to begin with: After I migrated, the shares (by link) I had were suddenly pointing to random files that I've never shared with anyone. And that's really, really bad. It feels like the shares rely on an ID allocated by the database and when you migrate the IDs might be different but the link still uses the same number. I haven't checked the actual implementation and I realize that this is probably a separate issue from this ticket but it has clearly a connection to database migration.

PVince81 commented 9 years ago

The database does uses file ids for shares and in the future "favorites".

If the conversion code isn't able to keep file ids then this is a major bug!

@bantu can you comment on this ?

DeepDiver1975 commented 9 years ago

If the conversion code isn't able to keep file ids then this is a major bug!

well - because we use autoincrement/sequences this will be really interesting to solve - we did not think about this .... tooo bad

DeepDiver1975 commented 9 years ago

@bantu please have a look at this as soon as time permits - THX

DeepDiver1975 commented 9 years ago

scheduling for 8.0 for at least analysis - let's see if we can come up with a solution .....

bantu commented 9 years ago

If the conversion code isn't able to keep file ids then this is a major bug!

well - because we use autoincrement/sequences this will be really interesting to solve - we did not think about this .... tooo bad

From reading the code (https://github.com/owncloud/core/blob/master/core/command/db/converttype.php#L242), all data is copied including autoincrement / sequence values. For PostgreSQL we additionally resynchronise sequences to their expected value (https://github.com/owncloud/core/blob/master/core/command/db/converttype.php#L271 https://github.com/owncloud/core/blob/master/lib/private/db/pgsqltools.php#L50)

DeepDiver1975 commented 9 years ago

moving to 8.1 - time is up for 8.0

bantu commented 9 years ago

@DeepDiver1975 Not sure what to do with this further.

Based on https://github.com/owncloud/core/issues/9891#issuecomment-69736128 I don't think an accidental change of primary key columns as suggested by https://github.com/owncloud/core/issues/9891#issuecomment-69718387 is possible.

In my opinion, the root cause of most migration issues is insufficient validation of input data (in the first place). E.g., if you're writing into a varchar 100 column, validate that the data actually is 100 characters or less before attempting to write into the DB. (There has to be a proper error message saying that the data is too long anyway; instead of a SQL error or another generic message.)

The above combined with the fact that SQLite accepts arbitrary length data (as per https://github.com/owncloud/core/issues/9891#issuecomment-52438670) leads migrations to fail with "data too long" errors. Since the data is already in the database, users will have to cleanup their DBs before migration manually in any case.

bantu commented 9 years ago

On second thought:

  1. The "After I migrated, the shares (by link) I had were suddenly pointing to random files that I've never shared with anyone." behaviour described in https://github.com/owncloud/core/issues/9891#issuecomment-69718387 can possibly be explain by the "DELETE FROM oc_filecache" query described in the ticket description in combination with SQLite's Truncate Optimization (https://www.sqlite.org/lang_delete.html): No WHERE clause --> DELETE actually is TRUNCATE --> Reset of auto increment counters -> reuse of ids -> shares pointing to different files.
  2. The initial SQL error "SQLSTATE[22021]: Character not in repertoire: 7 ERROR: invalid byte sequence for encoding "UTF8": 0xb4" remains to be investigated. So let's make this ticket about that.
bantu commented 9 years ago

On the other hand, my sqlite does not seem to reproduce.

SQLite version 3.8.6 2014-08-15 11:46:33
Enter ".help" for usage hints.
sqlite> .tables
sqlite> create table tbl1(ID INTEGER PRIMARY KEY AUTOINCREMENT, col2 VARCHAR(255));
sqlite> insert into tbl1 (col2) VALUES ('foo');
sqlite> insert into tbl1 (col2) VALUES ('bar');
sqlite> select * from tbl1;
1|foo
2|bar
sqlite> delete from tbl1;
sqlite> insert into tbl1 (col2) VALUES ('barz');
sqlite> select * from tbl1;
3|barz
DeepDiver1975 commented 8 years ago

@karlitschek @cmonteroluque this is the famous 3byte vs 4byte mysql char set issue - not really a solution at hand - suggest to move this to the backlog

karlitschek commented 8 years ago

Yes unfortunately. This needs a lot more thinking later.

ghost commented 8 years ago

Nod.

nickvergessen commented 8 years ago

The column in the new dav app is clob, so should be fine? @DeepDiver1975 @PVince81

DeepDiver1975 commented 8 years ago

The column in the new dav app is clob, so should be fine?

yes - woth exception of the mysql utf8 issue - but this a total different topic.

For all the other issues I don't see much we can do -> closing

lock[bot] commented 5 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.