DataObjects-NET / dataobjects-net

https://dataobjects.net
MIT License
60 stars 23 forks source link

How to copy data of field to another field and remove first one. #373

Closed AndrewLityagin closed 4 months ago

AndrewLityagin commented 5 months ago

Hello!

I had (for example) Order type with field Customer.ID (Type1). I added new field Client.ID (Type 2) into Order type. I need to copy all identificators from field Customer.ID to Client.ID and remove Customer.ID field. In database fields should have same type (long). Client.ID is absent in data base before upgrade. Which upgrade hint I should use in upgrade script?

I tried to use:

Thanks in advance!

alex-kulakov commented 5 months ago

Hello Andrew,

This is a complicated case which requires more work than usual.

CopyFieldHint/MoveFieldHint hints are for copying/moving field within hierarchy, it cannot change type of field, even simple types. RenameFieldHint handles change of name, when field remains its type untouched. It's not suitable for the job.

There is a ChangeFieldTypeHint can help in certain cases, but unfortunately, not in yours. The thing is that you change type of Field from one entity type to another. Entity is a complicated type though you can't see it, each entity is identified by Key fields (ID in your case) and also a unique type identifier, so, even if IDs of Customer and Client matched in references for some reason, logically (1, Custormer) and (1, Client) are not equal, and DataObjects.Net understands this difference. If you apply this type of hint to a field that changed its entity type, DataObjects.Net will delete data for the field, because logically rows that exist in database became irrelevant to new type.

So, what you have left with is manual data migration.

First of all, we encourage people to make soft upgrades with using of built-in mechanism like [Recycled] attribute, so old field softly fade out of usage. Using your terminology and assuming that Type1 is Customer and Type2 is Client, model to upgrade schema in database may look like:

[HierarchyRoot]
public class Order : Entity
{
  [Field, Key]
  public long ID { get; private set; }

  [Field, Recycled, Obsolete("Use Client instead")] // obsolete is to hint people to use new variant
  public Customer Customer {get; set;} // also it will be nice to change setter to private to prevent any changes

  [Field]
  public Client Client { get; set; }
}

RecycledAttribute makes field and data be presented and accessible in UpgradeHandler.OnUpgrade() method but not in final Domain model. You will be able to query data and make data migration. In some cases, it might be possible to use our Xtensive.Orm.BulkOperation extension but it has one very important limitation - it cannot work with hierarchies containing more than one entity.

If batches of SQL queries are supported by storage provider (e.g. sqlserver provider), it will be useful to increase SessionConfuguration.BatchSize from default 25 to something bigger for System session, because data manipulation will be intensive. You just need to be aware of number of parameters limitation for your RDBMS.

If usage of plain SQL is OK for you and you are not very satisfied by the speed of migration through entities, then SQL may be used to migrate data faster, especially, if only thing that required is copy value row by row. We provide a Session instance for OnUpgrade() method so you can use session services that provide low-level capabilities, including access to DbCommands. Like so,

      public override void OnUpgrade()
      {
        var sqlAccessor = UpgradeContext.Session.Services.Get<Services.DirectSqlAccessor>();

        using (var command = sqlAccessor.CreateCommand()) {
          command.CommandText = "<your text for command>";
          _ = command.ExecuteNonQuery();
        }
      }

But be aware of that sometimes migrations can require some maps in memory.

If you prefer hard way like "delete field and then deal with consequences" then it will require you to be creative and use SQL to some extend, especially if you have a lot of data and you need to be fast and memory efficient.

If you do such complicated upgrades with some regularity. I can suggest an idea.

The idea is to have a general-purpose ignore rule in DomainConfiguration for upgrade cases when type of Field changes from one Entity type to another and there is an opportunity to just copy old data to new column row by row. This rule will give a chance to have a SQL table level map instead of Dictionary in memory so operation will still be fast.

So, the steps:

First, Add a general ignore rule to DomainConfiguration.IgnoreRules collection, like so

   domainConfiguration.IgnoreRules.IgnoreColumn("TMP_FORUPGRADE_*");

The prefix may me different, but it should be quite specific to eliminate any chances of conflict with meaningful names in model. The prefix will have effect across all databases and schemas which are included in Domain so such rule can live forever as part of configuration or be included only when it is upgrade time by some condition you define in code, it is up to you.

It is also possible to make more specific rules, restricting scope of rule by table, schema and database

  domainConfiguration.IgnoreRules.IgnoreColumn("TMP_FORUPGRADE_*")
      .WhenTable("SomeTable")
      .WhenSchema("dbo")
      .WhenDatabase("DO-Test");

Then, before any schema changes appear (in OnPrepare or OnBeforeExecuteActions), create a new column in the table with declared prefix by ALTER TABLE, or in some cases rename existing column to have the prefix. In case of newly created column copy data with UPDATE ... SET [TMP_FORUPGRADE_Client.ID] = [Customer.ID]. The catch is that on the stages of domain building before OnStage Session instance is not provided, so you will have to create DbCommand some different way. Next, allow DataObjects.Net to drop Customer.ID by creation of RemoveFieldHint, Client.ID will also be created without any issues. After that, in OnUpgrade() you will be able to repeat UPDATE statement like UPDATE ... SET [Client.Id] = [TMP_FORUPGRADE_Client.ID] or UPDATE ... SET [Client.Id] = [TMP_FORUPGRADE_Customer.ID], depending on whether you preferred new column creation. Finally, delete temporary column from database, for example in OnComplete stage of UpgradeHandler, when Domain is upgraded and ready to be delivered to you. At this point you can open session and transaction and use built-in service.

I believe that there is more than one solution for this case, the described above is what comes to my mind right now.

I hope this will help.