fluentsprings / ExpressMapper

Mapping .Net types
http://www.expressmapper.org
Other
309 stars 65 forks source link

Allow override name comparison strategy #98

Open sergiorykov opened 8 years ago

sergiorykov commented 8 years ago

There are two kinds of built in name comparison strategies: case sensitive/case insensitive.

Like that

string.Equals(x.Name, prop.Name, stringComparison)
string.Equals(destProp.Name, matchStart, _stringComparison)

 internal StringComparison GetStringCase()
        {
            StringComparison stringComparison;

            if (MappingServiceProvider.CaseSensetiveMemberMap && !CaseSensetiveOverride)
            {
                stringComparison = MappingServiceProvider.CaseSensetiveMemberMap
                    ? StringComparison.CurrentCulture
                    : StringComparison.OrdinalIgnoreCase;
            }
            else
            {
                stringComparison = CaseSensetiveMember
                    ? StringComparison.CurrentCulture
                    : StringComparison.OrdinalIgnoreCase;
            }
            return stringComparison;
        }

I suggest to extract it in separate class to allow to write custom strategy.

For example, I need to map generated wrappers for protobuf .proto files which is written in c style:

public class ProtoA 
{
      public string very_important_content {get;set;}
      public long timestamp {get;set;}
}

public class DomainA 
{
      public string VeryImportantContent {get;set;}
      public DateTimeOffset Timestamp {get;set;}
}

I will create strategy for mapper and attach it to mapper:

public class UnixNameComparer : INameComparer 
{
      public bool AreEqual(string sourceName, string destinationName) 
      {
              // to lower and remove '_' and compare ordingalignorecase
              return true;
      }
}

Mapper.NameComparer = new UnixNameComparer();

This code only illustrates my proposal. Feature is rather simple. We only need to add backward compability with CaseSensetiveMemberMap (mark it as obsolote and use to override default name comparison strategy).

sergiorykov commented 8 years ago

@anisimovyuriy @bogatykh What do you think? I see you don't have enough time to answer.

I can implement it in a couple of days. Actually I really need it :).

anisimovyuriy commented 8 years ago

Dear @sergiory,

My apologies I'm in a vacation right now. Thanks for your input and contribution! Definitely go for it, fork it and create a PR - I'll be able to review it in 3 days. Thanks a lot!

sergiorykov commented 8 years ago

Ok :)

sergiorykov commented 8 years ago

https://github.com/AutoMapper/AutoMapper/wiki/Configuration#naming-conventions

sergiorykov commented 8 years ago

It was a bit more complicated then I expected. Didn't have enough time on weekend.

I will be grateful if you will take a look at my recent commits: https://github.com/sergiorykov/ExpressMapper/commits/master

I've implemented original support case sensitive/insetive checks (as is) in CaseSensitiveMemberNameComparer It's used by default to support full backward compability (without extended functionality - as is).

Supporting name conventions required changes in FlattenMapper. You used concatenation of strings to go throuhg source nested class to create "possible" source member name prefix to compare with destProp.Name. And one more hidden "gem" is linq method support - It took time to trace algorithm.

I've wrote initial test for renewed implementation here FlattenCustomNamingComparingTests.FlattenFatherSonGrandsonDtoFromLowerCaseWithUnderscoresSourceOk.

API I've took from AutoMapper - they have damn configurable API - and make it compatible with ExpressMapper's API. Testing different name mapping configuration is pretty easy - we only need to test strategy with strings (without real types). Will add in a day or two.

It will have almost the same flexibility as AutoMapper - I need only add one more impl of IMemberNameNormalizationRule to support prefixes/postfixes like in https://github.com/AutoMapper/AutoMapper/blob/2c191860b3645e8139762ec00e6f0bbdadc14bd7/src/AutoMapper/Configuration/Conventions/PrePostfixName.cs

PS: All classes are gathered in on IMemberNameComparer.cs - didn't want to change other projects any time I adding new class (will refactor it at the end of the feature.)

JonPSmith commented 8 years ago

Hi @sergiorykov,

I was the person who added the Flatten code (with help from @anisimovyuriy) to ExpressMapper. I did look at your commits but I wasn't quite clear what you were trying to achieve. Have you written any documentation? I saw some Unit Tests, but it didn't look like they were finished.

I wasn't sure if you are trying to exactly match AutoMapper's flattening approach? If you are I'm not sure that is a good aim (see my point at the end). If you let me know what you are aiming at then I can look at your code and make any sensible comments.

To help you I have a detailed explanation of what Flatten does in this article on my site, with its rules and some comments on where it differs from AutoMapper. Hopefully reading that might help you understand what the code is doing. NOTE: I did write some short docs for Yuriy but he hasn't had time to add them to the http://expressmapper.org/ site yet.

I would make one point: My approach allowed for camelCase or PascalCase when ExpressMapper is in its default mode, i.e. non-case sensitive matching. That is useful as C# needs Pascal and JavaScript (json) needs camel. Having to set up different modes for each mapping depending on their use would be a pain for me.

Sergiory commented 8 years ago

Fellows. I'm the owner of the email sergiory@gmail.com. You guys have been sending all this discussion, that has nothing to do with me , to my email. Please check your email info in order to prevent this from happening.

Regards Em 16/06/2016 15:55, "Jon Smith" notifications@github.com escreveu:

Hi @sergiorykov https://github.com/sergiorykov,

I was the person who added the Flatten code (with help from @anisimovyuriy https://github.com/anisimovyuriy) to ExpressMapper. I did look at your commits but I wasn't quite clear what you were trying to achieve. Have you written any documentation? I saw some Unit Tests, but it didn't look like they were finished.

I wasn't sure if you are trying to exactly match AutoMapper's flattening approach? If you are I'm not sure that is a good aim (see my point at the end). If you let me know what you are aiming at then I can look at your code and make any sensible comments.

To help you I have a detailed explanation of what Flatten does in this article http://www.thereformedprogrammer.net/flattening-entity-framework-relationships-with-expressmapper/ on my site, with its rules and some comments on where it differs from AutoMapper. Hopefully reading that might help you understand what the code is doing. NOTE: I did write some short docs for Yuriy but he hasn't had time to add them to the http://expressmapper.org/ site yet.

I would make one point: My approach allowed for camelCase or PascalCase when ExpressMapper is in its default mode, i.e. non-case sensitive matching. That is useful as C# needs Pascal and json needs camel. Having to set up different modes for each mapping depending on their use would be a pain for me.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/fluentsprings/ExpressMapper/issues/98#issuecomment-226510203, or mute the thread https://github.com/notifications/unsubscribe/AMfsyDeDlcGzhQH7ETn1IMqQgS5qC95jks5qMWPQgaJpZM4IzCRF .

sergiorykov commented 8 years ago

@Sergiory It seems GitHub uses search by prefix :). I'm not sure how fix it. Citing was direct to me.

sergiorykov commented 8 years ago

@JonPSmith Thank you for your comments. I have not write any documentation in comments - it's just working prototype, proof of concept. I will definetly add comments to simplify usage.

I'm not trying to reimplement flattening in AutoMapper's style. They have one more level of abstraction for properties. In our case it's not possible at all (without complete rewriting of API).

Original scheme of detection names for source and destination members relies on case sensivity and assumption that flatten name for destination property consists from source property names.

source.son.grandson.myInt -> destination.sonGrandsonMyInt

Automapper allows schemes like that:

SourceNameConvention = LowerCaseUnderscores DestinationNameConvention = PascalCase source.son.grandson.my_int -> destination.SonGrandsonMyInt

SourceNameConvention = LowerCaseUnderscores DestinationNameConvention = PascalCase Source.AddPostfix("dto") Destination.AddPrefix("domain") source.son_dto.grandson_dto.my_int_dto -> destination.domain_SonGrandsonMyInt

They can do it because they can divide names on components (from both sides), normalize destination name to source name using name convention for source members.

I've done the same thing according ExpressMapper's API. I needed to do two things:

public interface IMemberNameComparer
    {
        bool Satisfies(string sourceMemberName, string destinationMemberName);
        bool Satisfies(IEnumerable<string> chainOfSourceMemberNames, string destinationMemberName);
        bool SatisfiesAsFlatten(IEnumerable<string> chainOfSourceMemberNames, string destinationMemberName);
        bool SatisfiesAsSelf(string memberName, string otherMemberName);
    }

It's full interface to implement any name comparison strategy (according to current design) Satisfies - compares equality of member names in both cases - direct mapping and flattening

source.son.grandson.my_int -> destination.SonGrandsonMyInt

Satisfies(new [] {"son", "grandson", "my_int"}, "SonGrandsonMyInt") == true
SatisfiesAsFlatten(new [] {"son", "grandson", "my_int"}, "SonGrandsonMyInt") == true
SatisfiesAsFlatten(new [] {"son", "grandson"}, "SonGrandsonMyInt") == true

last on SatisfiesAsSelf is used for filtering self properties (source - source, dest - dest) and ignored by user.

It's been done using two techniques - name conventions allows parse member names to parts and recompose it using style of opposite side.

source.son.grandson.my_int -> destination.SonGrandsonMyInt SourceNameConvention = LowerCaseUnderscores DestinationNameConvention = PascalCase

we will try to normalize source member names (using Rules) and we will get several possibilities for original name (AutoMapper does exactly the same thing under the cover).

the only rule that implemented now is AsIsMemberNameNormalizationRule - it returns name it self for PostFix rule Source.AddPostfix("_dto") it will try to cut from source member name that prefix and will increase number of prototypes of semantic member name (dto_son -> [son, dto_son]).

then we will try compose all possible source names using dest name convention and compare them with dest prop name:

[[son],[grandson],[my_int]] -> [[Son],[Grandson],[MyInt]] -> SonGrandsonMyInt == destination.SonGrandsonMyInt

for flatten it will compare using string.StartsWith, for direct mapping of full flatten mapping: string.Equals.

EF is just custom case:

SatisfiesAsFlatten(new [] {"son", "children", "EF suffix: Count" }, "SonChildrenCount") == true

var enumerableMethod = FlattenLinqMethod.EnumerableEndMatchsWithLinqMethod(sourcePropPath.Select(p => p.Name), destProp.Name, _nameComparer);

return EnumerableMethodLookup.SingleOrDefault(x =>
            {
                var chainOfSourcePropsWithLinqSuffix = chainOfSourcePropNames.ToList();
                chainOfSourcePropsWithLinqSuffix.Add(x._methodName);
                return comparer.Satisfies(chainOfSourcePropsWithLinqSuffix, destinationName);
            });
JonPSmith commented 8 years ago

Hi sergiorykov,

Thank you for a very detailed description of what you are trying to do, which is a more comprehensive matching approach. I'm sure that you could implement something like that, but I have questions as to whether it is worthwhile. Here are my thoughts:

  1. My main concern is performance. AutoMapper is really powerful, but has a very high 'set up map' time. I timed ExpressMapper with Flattening 'set up map' against AutoMapper and ExpressMapper is approx eight time faster. This matters to me as my library, GenericServices tries to hide the setting up of the mapping, which therefore occurs on first real use.

    Clearly you can implement a design that minimises the mapping set up costs of your scheme for the default, non-case sensitive, condition. I would recommend doing some trials on any implementation as keeping the speed of ExpressMapper is part of the design goal. Here is a link to my TryMappers repro which I used for performance testing in case its useful.

  2. A much more minor point, and one I might be biased on as I don't know your use cases, but adding your code could make ExpressMapper more complicated. One of the attractions of ExpressMapper to me is it is fairly simple to understand. I tried to honour that simplicity when I did my flattening extension (@anisimovyuriy helped a lot on that too) by making flattening an add-on that people can ignore. I would suggest you try to do the same with any feature you add, i.e. leave case/non-case sensitive as the normal controls, with your new approach as an add-on.

Having dealt with the design issues I did want to point out one small design aspect of my flattening code that you need to be aware of.

The FlattenMapper class specifically excludes exact matches of names (see line 25 of the constructor in FlattenMapper.cs), as ExpressMapper picks them up. This means the mapping of those direct maps are handled outside the FlattenMapper class. I don't think you have included code to handle that (I think?).

sergiorykov commented 8 years ago

Thanks for clarifications!

I gave a try to ExpressMapper mainly because of proposed incredible performance. In case of mapping objects performance per se obviously will be the same as original code. Cost of 'set up map' will increase. I will run your tests from TryMappers too and will say results of it with default/my impl. Default implementation will use default name comparer wich works internally the same way as original code (it calls string.Equals/StartsWith).

I use AutoMapper in almost all my projects and TinyMapper for restricted environments. I actually use only minority of features AutoMapper supports and name configuration belongs to it. That is why I'm interested in framework that will be much faster and satisfies my requirements.

From one side cost of 'set up map' is one time cost, from the other side adding new features will make ExpressMapper much more complicated. I agree.

TinyMapper was created with exactly the same reasons in mind. It's super-duper-happy-path of using is simplified direct mappings (dto-domain, domain-entity).

My DTO's uses protobuf with linux style namings. I think customizing name comparing rules will cover that happy path.

In the long term performant and functional framework will win. AutoMapper 5.0 10X increased it's speed, but it's still much slower then ExpressMapper.

And one last thing - increased complexity of API. To eliminate it we should create an API which will be as much known by end users as possible. That is why I choose name comparing configuration in the same way as AutoMapper. I don't want to learn new API I want to find it where it supposed to be.

PS: line 25 FlattenMapper I took into account. Original mapping tests are OK. I didn't find special test for it. Under debugger I've seen that those properties are really ignored.

anisimovyuriy commented 8 years ago

Dear All,

Thanks a lot for such details and input on the matter!

  1. @JonPSmith I think performance won't be affected substantially as it happens during compile time. Another thing "flattening" is not turned on by default. What's your opinion?
  2. @JonPSmith agree about simplicity but we have to keep in mind that by default it is simple, but if user would like to have some advanced features - I think it is a plus but we need to implement it in a smart way to keep it simpler. Please provide your thoughts!

Anyway @sergiorykov - I'm very grateful what you've done already! I'm going to check your changes and give you my remarks.

Thanks a lot guys for your contribution - very appreciated!

Gratefully Yours, Yuriy

sergiorykov commented 8 years ago

Will wait your response. I don't think that I will have free time during this weekend. Functionally, It's API is almost done. Normalization rules may be a bit hard to understand but they should be compared to similar API of AutoMapper.

JonPSmith commented 8 years ago

Thanks for your helpful comments @anisimovyuriy. Very wise. sergiorykov - go for it.

sergiorykov commented 8 years ago

@JonPSmith I took a look at TryMappers. EF tests failed I don't have required DB. What about perf tests - TimerToConsole is elegant but for micro level ~1ms it requires more precise counters. I recommend BenchmarkDotNet.

I don't understand what SUT is.

using (new TimerToConsole())
{
    config = new AutoMapper.MapperConfiguration(cfg => cfg.CreateMap<Father, GenerationFlattenDto>());
}
using (new TimerToConsole())
{
    for (int i = 0; i < numTimes; i++)
    {
        //ATTEMPT
        var mapper = config.CreateMapper();
        var dto = mapper.Map<GenerationFlattenDto>(Father.CreateOne());

        //VERIFY
        dto.MyString.ShouldEqual("Father");
        dto.SonMyString.ShouldEqual("Son");
        dto.SonGrandsonMyString.ShouldEqual("Grandson");
    }
}

At least we should separate configuration/compilation part from execution part. In execution part I don't want to measure perf of Assert functions.

Please, specify concrete test(s) you want to see.

Algorithm has itself O(N^2) or even O(N^3) complexity (it's even without optimizations) , but in real case N == number of properties ~ 10-20. And so complex rules may be required for 10-20 classes out of 200-300 mappings in average project.

I can write test for algorithm (it's the only point where my impl really can affect overall perf) along with unit tests.

Instead, I think we should take a look at resulting API to simplify and minimize it. We took too much time to thirdparty questions. Can you provide comments about API?

PS: sorry, if not all the things I can express politely - I'm trying my best. :)

JonPSmith commented 8 years ago

Hi sergiorykov,

I'm going to answer your questions with the most important things you asked first.

1. Comments on your suggested API

I saw your suggestion of a separate Source and Destination mapping approach (like AutoMapper). That sounds fine.

I would suggest that your IMemberNameComparer should default to using the existing string comparison approach. That way any existing code will continue to work.

Other than that you can make your API have all sorts of feature, but I do suggest you document them with examples if you want others to use them.

2. Performance of your algorithm

As you say your algorithm is only called in the set up of the Mapping. As the author of ExpressMapper (@anisimovyuriy) says most people don't care much about that, so performance of your algorithm doesn't matter that much.

I should say that do I care, but that is because of the way my library GenericServices works, but that is specific to me. So, don't worry too much about performance.

3. My old TryMapper code (ignore it)

I only suggested TryMappers as it might give you a start. I recommend you use your own approach for testing performance.

NOTE: You has a problem with running TryMapper because of the database. It might be that you need to manually run the Unit Test called WipeCreateDatabase which is in Test10CompareMappersEntityFramework , line 37. This creates a new database.

anisimovyuriy commented 8 years ago

Brilliant job @sergiorykov - very appreciated your contribution! I left some remarks inside of your commits.

Thanks, Yuriy

sergiorykov commented 8 years ago

@JonPSmith

1) Absolutely agree. CaseSensitiveMemberNameComparer implements exactly original approach to be fully backward compatible and will be used when don't override strategy. Documentation in case of this feature is required. It will be written as unit tests first and then will go to wiki/use cases in readme.

sergiorykov commented 8 years ago

@anisimovyuriy

Thanks :)

I've seen your recent commits. The part in ProcessAutoProperties is a bit tricky. Searching in CustomPropertyCache and IgnoreMemberList should be implemented via MemberNameComparer.Satisfies. There are several places when you are searching in those collections (without latest addition of string comparison) and it may lead to possible bugs in other scenarios.

Are there any more places where you compare names from source and destination? Actually, I've missed those collections.

anisimovyuriy commented 8 years ago

Sure it should be implemented with your changes - completely agree. That was just a fix for current version. Let me check if there are multiple places as it should be only one unified method and I'll get back to you soon. Thanks, Yuriy