vkhorikov / CSharpFunctionalExtensions

Functional extensions for C#

MIT License

2.43k stars 303 forks source link

Soc to minimize transient dependencies #435

Open ProphetLamb opened 2 years ago

ProphetLamb commented 2 years ago

Synopsis

Apply soc to minimize transient dependencies when importing CSharpFunctionalExtensions by enforcing structure and separating unrelated and independent functionality into different projects.

This is what I plan to do for #hacktoberfest.

Implementation

So let's go with the proposal, implement poly-repository structure and let's see the actors:

CSharpFunctionalExtensions
|- Abstractions: Interfaces
|- Maybe
|  |- Core: The core functionality 
|  |- Extensions
|     |- Common: Common extensions methods
|     |- Collections: Extensions related to operations on collections
|     |- Threading.Tasks: The async Task stuff
|     |- Threading.ValueTask: The async ValueTask stuff
|     |- Nullable: The System.Nullable interop
|- Result
|  |- Core
|  |- Extensions
|     |- [...]
|- ValueObject
|  |- Core
|  |- Extensions
|     |- [...]
|- Entity
|  |- Core
|  |- Extensions
|     |- [...]

Reasoning

In many projects, our team doesn't need even a fraction of the functionality, for example we don't use ValueObject and Entity at all, yet these types are still imported, it would be great to opt out of them.

For the second reason, let me illustrate the following example the Core and Extensions.General functionality of Maybe doesn't have any dependencies, which is great. Now we have a CLI app and we include said packages and don't have any transitional dependencies. Yet if we currently import CSharpFunctionalExtensions we must include System.Threading.Tasks.dll for a successful load. This is not problematic for most, but when registering an assembly as ocx for com interop all referenced assemblies have to be registered.

For the third reason assume CSharpFunctionalExtensions wishes to support integration into another 3rd party library, this absolutely required a separate project, because otherwise another 3rd party library needs to be loaded. Enforcing a repository structure lays the groundwork for such support.

Problems

Initially you may think, that existing projects would have to change anything, but this is not true, the project CSharpFunctionalExtensions simply references all subsidiary projects and is published as is.

The root namespace for subsidiary projects is by default hierarchical, this has to be overwritten <RootNamespace>CSharpFunctionalExtensions<RootNamespace>.

So in the end this doesn't introduce any breaking changes.

Examples

Most big projects use a poly-repository structure. See Serilog, Hosting, Logging, Uno etc.

Death to monoliths!

ProphetLamb commented 2 years ago

Exporting a hierarchical project structure would be sensible aswell. CSharpFunctionalExtensions.Maybe.Core

#if NAMESPACE_HEIRACHICAL
namespace Functional.Maybe;
#else
namespace CSharpFunctionalExtensions;
#endif

With separate project files where one defines

<Product>Functional.Maybe<Product/>
<DefineConstants>$DefineConstants;NAMESPACE_HEIRACHICAL<DefineConstants/>

And the other is as is. This would publish the poly-repository separately from CSharpFunctionalExtensions.

Tho I do not know if this is desirable. What is your opinion @vkhorikov ?

vkhorikov commented 2 years ago

Thanks for this proposal. I've been thinking about separating the NuGet package into separate smaller packages as well (while leaving the overarching CSharpFunctionalExtensions as-is for compatibility).

A couple of comments/questions:

In the hierarchy that you shared, do the top-most nodes represent separate projects and NuGet packages, while the nodes underneath them represent separate namespaces? For example here:

CSharpFunctionalExtensions
|- Maybe
|  |- Core: The core functionality 
|  |- Extensions
|     |- Common: Common extensions methods
|     |- Collections: Extensions related to operations on collections
|     |- Threading.Tasks: The async Task stuff
|     |- Threading.ValueTask: The async ValueTask stuff
|     |- Nullable: The System.Nullable interop

Maybe would be a project and a NuGet package, and Maybe.Core, Maybe.Extensions.Common, Maybe.Extensions.Collections, etc are namespaces inside that project, correct?

I don't think we need a separate Abstractions project/package. Unless you have a specific use case in mind that we can discuss, I suggest putting abstractions to their respective packages. So, 4 projects/packages in total:
- Maybe
- Result
- ValueObject
- Entity
We shouldn't change the package names (or namespaces) to Functional, that would lead to confusion among users. It should be CSharpFunctionalExtensions.Maybe, not Functional.Maybe.
I don't think namespaces should be that detailed. CSharpFunctionalExtensions.Result is enough for all Result-related functionality, no need for CSharpFunctionalExtensions.Result.Extensions, etc.
There is functionality in Maybe that refers to Result, e.g the conversion from Maybe to Result (ToResult()). How can this be handled without referencing Result in Maybe?

ProphetLamb commented 2 years ago

Thanks for your quick answer :)

1. Namespaces

The hierarchy reflects the Nuget packages, where every node represents a package. Nodes with description have actual functionality. Nodes without description are aggregations of subordinate nodes. Whether a project should have a namespace or not is not represented. An aggregation package simply references the direct subordinate project's then is published as is. This can be achieved using nuspec and targets or simply a csproj:

<Product>CSharpFunctionalExtensions</Product>

<ProjectReference Include="../Maybe" />
<ProjectReference Include="../Result" />
<ProjectReference Include="../ValueObject" />
<ProjectReference Include="../Entity" />

Including the resulting Nuget package CSharpFunctionalExtensions allows access to all functionality of subordinate projects.

The paradigms of poly-repos are:

No project should have more than one publicly visible namespace, except for internal helpers.
Structure namespaces inside the solution using projects, instead of nested directories.
A namespace should represent a single purpose.

Using the package hierarchy will result in the following list of packages for Maybe:

CSharpFunctionalExtensions.Maybe: All Maybe related functionality
CSharpFunctionalExtensions.Maybe.Core: Core types and interfaces.
CSharpFunctionalExtensions.Maybe.Extensions: All extension methods.
CSharpFunctionalExtensions.Maybe.Extensions.Common: Common extensions methods
CSharpFunctionalExtensions.Maybe.Extensions.Collections: Extensions related to operations on collections
CSharpFunctionalExtensions.Maybe.Extensions.Threading.Task: The async Task stuff

The directed dependency graph is described below:

1->2
1->3

3->4
3->5
3->6

4->2

5->2
5->4

6->2
6->4

This allows a user to include

<PackageReference Include="CSharpFunctionalExtensions.Maybe" />

For all Maybe related functionality.

<PackageReference Include="CSharpFunctionalExtensions.Maybe.Core" />

For only the core type.

<PackageReference Include="CSharpFunctionalExtensions.Maybe.Extensions.Common" />

For transient dependencies on the Core type and collection extension methods.

2. Abstractions

I agree that a separate abstractions project likely isn't necessary for CSharpFunctionalExtensions, because 3rd party implementations don't exist and likely never will. Additionally, the Interfaces aren't used for the extension methods. While it doesn't make sense for this project, for many other projects it does. See Microsoft.Extensions.Logging.Abstractions or Microsoft.Extensions.Hosting.Abstractions and many other projects on Nuget.

3. Change Root Namespaces

Agreed

4. Detailed Namespaces

Agreed. Namespaces should not be that detailed, it convoluts the using block. Its sensible to only have namespaces for Root projects Maybe, Result etc. and adopt the <RootNamespace/> for subordinate projects.

*** CSharpFunctionalExtensions/Maybe/Extensions/Common.csproj
<Product>CSharpFunctionalExtensions.Maybe.Extensions.Common</Product>
<RootNamespace>CSharpFunctionalExtensions.Maybe</RootNamespace>

5. Integration

Integration projects are separate projects with a PackageReference to the external and a ProjectReference to the internal projects. The Product is inferred as follows [Internal][.Submodule].Integration.[External].

For example, let's say we which to publish integration for Functional.Maybe the project file would look like so:

<Product>CSharpFunctionalExtensions.Maybe.Integration.FunctionalMaybe</Product>
<RootNamespace>CSharpFunctionalExtensions.Maybe</RootNamespace>

<ProjectReference Include="../Maybe/Core" />
<PackageReference Include="Functional.Maybe" />

The functionality to convert between Functional.Maybe and CSharpFunctionalExtensions is then implemented in

FunctionMaybeExtensions.cs, and
MaybeExtensions.cs

The naming is a bit iffy here, because of the similar product names ^^

vkhorikov commented 2 years ago

The hierarchy reflects the Nuget packages, where every node represents a package.

That's way too many packages IMO. Why would anyone import Maybe.Core but not Maybe.Extensions?

I still don't fully understand point 5, but I'll read more on this.

ProphetLamb commented 2 years ago

The reason for a separate Core project is to allow creating interoperability libraries. E.g. we convert CSharpFunctionalExtensions Maybe to the System.Nullable datatype, or Functional.Maybe datatype, without including any unrelated functionality.

This is related to point 5.

Ill try to be a bit more clear with the following concrete example: Given two libraries CSharpFunctionalExtensions.Maybe and Functional.Maybe. The goal is to convert between our Maybe type and theirs. To achieve this, we create a separate project called CSharpFunctionalExtensions.Maybe.Integration.FunctionalMaybe and import the following dependencies.

<ProjectReference Include="../Maybe/Core" />
<PackageReference Include="Functional.Maybe" />

Please note that none of our extension methods are necessary at all to achieve this, only the type Maybe.

Therefore, we don't want to include any extension methods. This is only possible by using the core project.

In the project, we create two extension methods AsMaybe with the identities

Functional.Maybe: CSharpFunctionalExtensions.Maybe and
- CSharpFunctionalExtensions.Maybe: Functional.Maybe.

This allows a user to import the above-mentioned package to convert between the types. The package will not add unwanted transient dependencies to their project, because the Core project does not have any. This allows the user to keep the deployed libraries dependencies slim. This is especially important when the user needs to manually register all dependencies.

Build Tasks
COM objects
Sourcegenerators

And many more require manual management of ALL dependencies.

ProphetLamb commented 2 years ago

When adding a new feature to a solution, the decision whether a new package should be created is made using the following criteria (I am summarizing Microsoft internal guidelines):

a new dependency is required,
a new type is created, that is not 1st degree* dependent on an existing type and does not share a common purpose.
project complexity is high, and the project allows for clean separation

Ofc to what degree CSharpFunctionalExtensions should to follow these guidelines is debatable, I am putting them in the issue, so that we can evaluate whether they are sensible for this library.

There is no additional work associated with managing one versus a thousand packages:

Azure CI/CD can pack projects and publish build artefacts to NuGet trivially.
Releasenotes can be sourced from file.
Publisher and repository specific information such as copyright, owners, URLs, sourcelink can be included using the Directory.Build.props project file.

So that the actual project file contains less than a dozen sloc. There is virtually no extra effort required for managing multiple projects, while there are usecases where a user benefits from the ability to select a specific subset of feature.

Similar discussions were held for many big open-source libraries - including the previously mentioned - and all ultimately decided for this approach.

*meaning directly included, in an identity or field

vkhorikov commented 2 years ago

Thanks for the detailed explanation, this is very helpful.

This allows a user to import the above-mentioned package to convert between the types. The package will not add unwanted transient dependencies to their project, because the Core project does not have any.

Would there be an issue if there are 2 references to the same package, one direct and the other one transient? For example:

[My Project] -> [Package 1] -> [Package 2] (Package 2 here is a transient dependency for package 1) [My Project] -> [Package 2] (The project refers to package 2 directly)

How [My Project] would view a class from [Package 2] in this case? As the same one or difference ones (given that they come from different packages)?

There is no additional work associated with managing one versus a thousand packages

I understand that there are no differences in splitting the library in 4 or 40 packages. My concern is user experience. It's quite overwhelming to see 40 options when you are trying to install a package. I would rather have this list as compact as possible.

The reason for a separate Core project is to allow creating interoperability libraries.

My other concern is YAGNI. This is the same argument as with a separate Abstractions package. I don't think we'd ever need to implement interoperability features for Functional or other similar libraries since a user who uses one such library wouldn't need another one. At the very least, I'd like to keep this separation to just Core and Extensions:

Maybe
Maybe.Core
Maybe. Extensions
Result
Result.Core
Result. Extensions
ValueObject
Entity

ProphetLamb commented 2 years ago

Dependency Graph

Would there be an issue if there are 2 references to the same package, one direct and the other one transient?

There are multiple scenarios to consider when building the dependency graph in regard to the version of [package 2]. I am assuming weak versioning.

Equivalent: The direct dependency is used for any reference
Direct > Transient: The direct dependency is used for any reference. This emits a warning during restore.
Transient > Direct: The transient dependency version is used. This emits a warning during restore. User should bump the version of the direct dependency.

Resolved dependencies

How [My Project] would view a class from [Package 2] in this case? As the same one or difference ones (given that they come from different packages)?

[package 2] will only ever be included once, according to the summary of the behaviour in [@ Dependency Graph]. The assembly for [package 2] of the specific Version, is cached during the restore process for [My Project] in the NuGet cache. The metadata of the assembly is then used throughout the project.

Further information

Many statements above need *asterisks, because manual intervention allows overriding the dependency resolution and loading at many points. Unordered clarifications below:

If [package 2] is included transiently multiple times with different versions, a direct dependency to [package 2]—fulfilling the condition above—is required to resolve the conflicting version requirements.

Referencing a package using fixed versioning or a fat package will not ever compile, because of conflicting version requirements for the former and because of an assembly load error for the latter. Neither of those technologies are ever used for NuGet packages, and I suspect most don't know it is even possible. Therefore, this is a none issue. Interesting nonetheless.

The two solutions to these conflicts (that are known to me) are using binding redirects.

app.config as described here.
Custom ResolveEventHandler here

More in-depth on binding redirects and dependencies here.

A workaround for including multiple assemblies with the same name but different versions exists

ProphetLamb commented 2 years ago

I would really appreciate, if we could have a separate project for the Task related extension methods, because of the dependency it introduces.

https://github.com/vkhorikov/CSharpFunctionalExtensions/blob/d60de307a92572ea51bbd1c9bd184188cd234f39/CSharpFunctionalExtensions/CSharpFunctionalExtensions.csproj#L45-L48

I just so happen to support a legacy project using a .Net 4 COM object. And would love to remove the dependency from the project, because I have to manually register each DLL manually extracted from the NuGet package.

vkhorikov commented 2 years ago

Thanks again for the explanation and the references.

I would really appreciate, if we could have a separate project for the Task related extension methods, because of the dependency it introduces.

So I assume this separation would be sufficient for your use case?

Maybe
Maybe.Core
Maybe.Extensions
Maybe.Extensions.Common
Maybe.Extensions.Tasks
Result
Result.Core
Result.Extensions
Result.Extensions.Common
Result.Extensions.Tasks
ValueObject
Entity

You are going to use Result.Core and Result.Extensions.Common, correct?

EDIT: Or rather, Result.Extensions.Async instead of Result.Extensions.Tasks

ProphetLamb commented 2 years ago

That sounds awesome.

You are going to use Result.Core and Result.Extensions.Common, correct?

Yes

vkhorikov commented 2 years ago

Let's go with this separation then.

jeffward01 commented 1 year ago

I hope that its not to late to make a suggestion.

I want to propose a good reason to include the .Abstractions project.

I highly suggest that it would be great to remove the 'bare' Entity and ValueObject classes, and instead implement interfaces.

For example, currently, this library is married to its implementation of Entity, as an example, or EnumValueObject. But lets focus on Entity, and Entity<T>.

Many people including myself have our own base implementations of Entity, this is very common in enterprise software to have a unique implementation of a base class per business rules. Mine looks something like this:

public abstract partial class Entity<TKey> : IGeneratesDomainEvents, IEntity<TKey>, IEntity
    where TKey : IComparable<TKey>, IEquatable<TKey>
{
    private readonly ConcurrentQueue<IDomainEvent> _domainEvents = new();

    private DateTime _createdAt;

    private DateTime _deactivatedAt;

    private DateTime _deletedAt;

    private DateTime _modifiedAt;

    public abstract object[] GetKeys();

     public TKey? Id { get; }

    public bool IsActive { get; }

    public bool IsDeactivated { get; }

    public bool IsSoftDeleted { get; }

    public bool IsDeactivatedAndDeleted { get; }

    public bool HasPrimaryKey => this.IsTransient.IsFalse();

    public abstract bool IsTransient { get; }

    public DateTime? SoftDeletedAtUtc =>
        this._deletedAt.Equals(default) ? default : this._deletedAt;

    public DateTime CreatedAtUtc => this._createdAt.Equals(default) ? default : this._createdAt;

    public DateTime? DeactivatedAtUtc =>
        this._deactivatedAt.Equals(default) ? default : this._deactivatedAt;

    public DateTime ModifiedAtUtc => this._modifiedAt.Equals(default) ? default : this._modifiedAt;

    public TKey CreatedByUserId { get; }

    public TKey? DeletedByUserId { get; }

    public TKey? DeactivatedByUserId { get; }

    public TKey? ModifiedByUserId { get; }

    public void MarkCreate(TKey createdByUserId, DateTime? utcDateTime = null) { }

    public void MarkDeactivate(TKey deactivatedByUserId, DateTime? utcDateTime = null) { }

    public void MarkSoftDelete(TKey deletedByUserId, DateTime? utcDateTime = null) { }

    public void MarkModified(TKey modifiedByUserId, DateTime? utcDateTime = null) { }

    public void ClearDomainEvents()
    {
        this._domainEvents.Clear();
    }

    public int GetRaisedDomainEventCount()
    {
        return this._domainEvents.Count;
    }

    public IEnumerable<IDomainEvent> GetAllDomainEvents()
    {
        return this._domainEvents;
    }

    public async IAsyncEnumerable<IDomainEvent> GetAllDomainEventsAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken = default
    )
    {
        await foreach (
            IDomainEvent domainEvent in this._domainEvents
                .ToAsyncEnumerable()
                .WithCancellation(cancellationToken)
        )
        {
            yield return domainEvent;
        }
    }

    protected internal virtual void RaiseDomainEvent(IDomainEvent domainEvent)
    {
        this._domainEvents.Enqueue(domainEvent);
    }
}

What is most important to me, is the ability to add 'auditing', The events are also important, but that can be abstracted away to the AggregateRoot, if needed.

Currently, what I would need to do is install CSharpFunctionalExtensions into my base project, then inherit directly from this. It just is very tightly coupled.

While, if instead CSharpFunctionalExtensions used an interface, I would not need to install CSharpFunctionalExtensions in my deepest rooted project, I could instead simply inherit from IEntity and reap the benefits of doing so.

I posted more about this here: https://github.com/xavierjohn/FunctionalDDD/discussions/16

You could do so without a breaking change if the projects were broken up like this

Maybe
Maybe.Core
Maybe.Extensions
Maybe.Extensions.Common
Maybe.Extensions.Tasks
Result
Result.Core
Result.Extensions
Result.Extensions.Common
Result.Extensions.Tasks
ValueObject.Abstractions
ValueObject
Entity.Abstractions
Entity

Users could then pick and choose what they want, and how they want it.

Another added benefit is this library 'FluentResults' was inspired by @vkhorikov

Its honestly a really great error handling and result handling library, that I think CSharpFunctionalExtensions can benefit from.

Currently, it cannot be intergrated easily into CSharpFunctionalExtensions - but with an abstraction project, it can more easily be intergrated.

I myself have thought about forking CSharpFunctionalExtensions just for the sole purpose of added FluentResults -- I think many people would expand on CSharpFunctionalExtensions if given the abiltiy to do so, and I think @vkhorikov is being very humble in thinking that this wont be the case. CSharpFunctionalExtensions is a very popular and highly respected library.

jeffward01 commented 1 year ago

Another proposal I have is to change the project structure to something like this:

Maybe
Maybe.Core
Maybe.Enumerable.Extensions
Maybe.Enumerable.Extensions.Common
Maybe.Enumerable.Extensions.Tasks
Maybe.Queryable.Extensions
Maybe.Queryable.Extensions.Common
Maybe.QueryableExtensions.Tasks
Result
Result.Core
Result.Enumerable.Extensions
Result.Enumerable.Extensions.Common
Result.Enumerable.Extensions.Tasks
Result.Queryable.Extensions
Result.Queryable.Extensions.Common
Result.Queryablee.Extensions.Tasks
ValueObject.Abstractions
ValueObject
Entity.Abstractions
Entity

It might even be useful to add a Q Postfix to each of the methods, to force the compiler to render the IQueryable version, and not IEnumerable version. We know that the compiler can get confused at times, and its best to be explicit.

The Q suffix is open to debate and should be refined of course, but I really think the addition of Queryable and Enumerable projects would be good for the ecosystem.

jeffward01 commented 1 year ago

Lastly for now, am sure this will get shot down, but I propose the usage of the Async suffix. While I understand that technically the Async suffix is not needed; a few of the 'cons' of not using the Async suffix are:

Developer confusion (we get asked this often in the issues)
Compiler confusion, rendering the wrong method unless explicitly referenced - (I believe this happens)
In the spirit of explicit naming conventions, its nice to be explicit, currently it is not which leads to some confusion on await and async usage.