dotnet / java-interop

Java.Interop provides open-source bindings of Java's Java Native Interface (JNI) for use with .NET managed languages such as C#
Other
200 stars 52 forks source link

Allow for bulk namespace changes #727

Closed jpobst closed 2 years ago

jpobst commented 4 years ago

Today, when generator creates a .NET namespace from a Java package, it applies a simple pascal case transformation to make the namespace match established .NET naming standards.

For example: package android.database becomes namespace Android.Database.

However there are a few scenarios that this is not a good fit for.

(1) Words where Pascal case is not desired: androidx -> AndroidX (2) Java package names are often longer than C# namespaces: com.google.android.material.animation -> Google.Android.Material.Animation.

Both of these scenarios can lead to many repeated metadata lines to fix:

<attr path="/api/package[@name='androidx.core.accessibilityservice']" name="managedName">AndroidX.Core.AccessibilityService</attr>
<attr path="/api/package[@name='androidx.core.app']" name="managedName">AndroidX.Core.App</attr>
<attr path="/api/package[@name='androidx.core.content']" name="managedName">AndroidX.Core.Content</attr>
<attr path="/api/package[@name='androidx.core.database']" name="managedName">AndroidX.Core.Database</attr>
<attr path="/api/package[@name='androidx.core.graphics']" name="managedName">AndroidX.Core.Graphics</attr>
<attr path="/api/package[@name='androidx.core.graphics.drawable']" name="managedName">AndroidX.Core.Graphics.Drawable</attr>
<attr path="/api/package[@name='androidx.core.hardware.display']" name="managedName">AndroidX.Core.Hardware.Display</attr>
<attr path="/api/package[@name='androidx.core.hardware.fingerprint']" name="managedName">AndroidX.Core.Hardware.Fingerprint</attr>
<attr path="/api/package[@name='androidx.core.internal']" name="managedName">AndroidX.Core.Internal</attr>
<attr path="/api/package[@name='androidx.core.internal.view']" name="managedName">AndroidX.Core.Internal.View</attr>
etc..

(Source)

Proposal

To help these scenarios, we could introduce a new MSBuild item that would allow simple replacements. Using MSBuild gives users the flexibility of MSBuild like conditionally specifying replacements based on target framework or importing it to multiple projects via Directory.Build.targets. Additionally it might spare some users from the complexity of metadata.

Examples:

<ItemGroup>
  <AndroidNamespaceReplacement Include='Androidx' Replacement='AndroidX' />
  <AndroidNamespaceReplacement Include='Com' Replacement='' />
  <AndroidNamespaceReplacement Include='Com.Google.' Replacement='Google' />
</ItemGroup>

Implementation Notes

These replacements would only be run for <package> elements that do not specify a @managedName attribute. If you use @managedName you are opting to provide the exact name, we will not process it further.

Unlike unused metadata, these replacement will not raise a warning if they are unused.

Case Sensitivity

Replacements take place after the automatic Pascal case transform, but the compare is case-insensitive.

Thus, both of the following are equivalent:

<AndroidNamespaceReplacement Include='Androidx' Replacement='AndroidX' />
<AndroidNamespaceReplacement Include='androidx' Replacement='AndroidX' />

Word Bounds

Replacements take place only on full words (namespace parts).

Thus,

<AndroidNamespaceReplacement Include='Com' Replacement='' />

Matches matches Com.Google.Library, but not Common.Google.Library or Google.Imaging.Dicom.

Multiple full words can be used:

<AndroidNamespaceReplacement Include='Com.Google' Replacement='Google' />
<AndroidNamespaceReplacement Include='Com.Androidx' Replacement='Xamarin.AndroidX' />

Word Position

The word part match can be constrained to the beginning or end of a namespace by appending a . or prepending a ., respectively.

<AndroidNamespaceReplacement Include='Androidx.' Replacement='Xamarin.AndroidX' />

matches Androidx.Core, but not Square.OkHttp.Androidx.

Similarly,

<AndroidNamespaceReplacement Include='.Compose' Replacement='ComposeUI' />

matches Google.AndroidX.Compose, but not Google.Compose.Writer.

Replacement Order

Replacements run in the order specified by the <ItemGroup>, however adding to this group at different times may result in an unintended order.

Replacements are run sequentially, and multiple replacements may affect a single namespace.

<AndroidNamespaceReplacement Include='Androidx' Replacement='Xamarin.AndroidX' />
<AndroidNamespaceReplacement Include='View' Replacement='Views' />

changes Androidx.View to Xamarin.AndroidX.Views.

jpobst commented 4 years ago

@jonpryor @moljac @mattleibow @Redth Was thinking about something like this and could use some feedback from more experienced binders than myself!

jonpryor commented 4 years ago

Note periods are not required, however they may help prevent unintentional matches:

I think "something" should be required; I can't think of any deliberate scenario where you'd want to have MicOSoft.

Thus, I think the value for //ns-replace/@value should start with either a ^ or a ., and that the element body should "start with" the empty string or ., matching @value:

A problem with this dichotomy is that a nested package-part of androidx won't be touched at all. I'm not sure if this is a good thing or a bad thing.

jonpryor commented 4 years ago

If we drop the string.Replace() idea and start requiring regular expressions or something more complicated, we could instead say that we only match and replace on "package-parts":

<package-part java="androidx" managed="AndroidX" />
<package-part java="accessibilityservice" managed="AccessibilityService" />
<package-part java="core" managed="Core" />
<package-part java="io" managed="IO" />
<!-- … -->

By only matching complete strings at:

We can avoid the MicrOSoft replacement scenario, while getting "reasonable" replacements "everywhere":

mattleibow commented 4 years ago

What if you also use a path="..." as well as a filter?

<ns-replace value="androidx." path="/package/path/to/type[something]">AndroidX.</ns-replace>
jpobst commented 2 years ago

@jonpryor @moljac @mattleibow @Redth Simplified the proposal based on feedback, and changed from specified with metadata to specified with MSBuild.

jonathanpeppers commented 2 years ago

If you add support for this in MSBuild, should it just be so flexible that you don't even need Metadata.xml?

For:

<attr path="/api/package[@name='androidx.core.accessibilityservice']" name="managedName">AndroidX.Core.AccessibilityService</attr>

You could put:

<AndroidMetadataAttr Include="/api/package[@name='androidx.core.accessibilityservice']" ManagedName="AndroidX.Core.AccessibilityService" />

Which you could also create "shorthand" for (this would be the equivalent):

<AndroidNamespaceReplacement Include="androidx.core.accessibilityservice" Replacement="AndroidX.Core.AccessibilityService" />

It doesn't seem like it's worth inventing a new pattern matching syntax? Should you still use xpath as before, but we create some "shortcuts"?

The only drawback is that incremental builds might be worse when .csproj files change vs Metadata.xml. We trigger a lot of targets to rerun when $(MSBuildAllProjects) changes.

jonpryor commented 2 years ago

To address the current proposal:

The word part match can be constrained to the beginning or end of a namespace by prepending a [ or appending a ], respectively.

If we need to do this, we should instead use Regular Expression characters, ^ for beginning-of-namespace, and $ for end-of-namespace. Neither ^ nor $ are valid namespace characters in C#.

<AndroidNamespaceReplacement Include='^Androidx' Replacement='Xamarin.AndroidX' />
jonpryor commented 2 years ago

@jonathanpeppers suggested using the //attr/@path value for the %(AndroidMetadataAttr.Identity) value:

<AndroidMetadataAttr Include="/api/package[@name='androidx.core.accessibilityservice']" ManagedName="AndroidX.Core.AccessibilityService" />

The conceptual problem with this approach is that it implies an exact match when an inexact match is desirable: /api/package[@name='androidx.core.view'] looks like it should only match the androidx.core.view package, and not nested packages such as androidx.core.view.accessibility, androidx.core.view.animation, or androidx.core.view.inputmethod. If you do want "starts-with" semantics anyway, you'd have to "munge" the XPath to insert a starts-with(), which significantly complicates implementation.

Even ignoring the implementation complexities, it doesn't look like what you want: it looks like an exact match, but we want an inexact match.

I don't think using the //attr/@value value is appropriate.

jpobst commented 2 years ago

On metadata:

Correct, this feature should not be implemented as metadata, for both performance and desired behavior reasons.

However, it could be passed to generator mixed inside a metadata file, and the metadata file reader could be extended to understand it, and hand it off to the appropriate implementation code.

ie, it could look something like:

<ns-replace source='androidx' replacement='AndroidX' />

Then we could extend our MSBuild support to also allow for <AndroidMetadataAttr> elements if desired.

I'm torn if that's actually a good idea or not. 😜

jpobst commented 2 years ago

If we need to do this, we should instead use Regular Expression characters, ^ for beginning-of-namespace, and $ for end-of-namespace. Neither ^ nor $ are valid namespace characters in C#.

I'm up for better ideas for than square brackets, but I'm not a fan of using RegEx characters. I do not think they have any meaning to our average target user. (I've used plenty of regular expressions in my career but did not know those (or any other regex) characters.)

My thought with square brackets was some familiarity with "ranges", even though this is clearly not the same mental construct.

NuGet versioning:

<PackageReference Include="ExamplePackage" Version="[1,3)" />

C# ranges:

var range = words[6..];

After a little more thought, I think maybe periods are my current preferred option:

Start of namespace:

<AndroidNamespaceReplacement Include='Androidx.' Replacement='Xamarin.AndroidX' />

End of namespace:

<AndroidNamespaceReplacement Include='.Androidx' Replacement='Xamarin.AndroidX' />
moljac commented 2 years ago

@jonathanpeppers

If you add support for this in MSBuild, should it just be so flexible that you don't even need Metadata.xml?

My philosophy is not to be invasive, so I used the fact that

You will see in most of my bindings Metadata.Namespaces.xml only for package names.

I did MsBuild custom task delivered via NuGet to create such Metadata.Namespaces.xml.

https://github.com/HolisticWare-Xamarin-Tools/HolisticWare.Xamarin.Tools.Bindings.XamarinAndroid.MetadataXmlSpitter/blob/master/source/HolisticWare.Xamarin.Tools.Bindings.XamarinAndroid.ApiXmlSpitter/PackagesToNamespacesSpitter.cs

and I "force" developer to verify and change more appropriate .NET namespace name:

https://github.com/HolisticWare-Xamarin-Tools/HolisticWare.Xamarin.Tools.Bindings.XamarinAndroid.MetadataXmlSpitter/blob/4d640c116057498de0be049c3507df1bc38c0a4d/source/HolisticWare.Xamarin.Tools.Bindings.XamarinAndroid.ApiXmlSpitter/PackagesToNamespacesSpitter.cs#L91

I reused our tooling and generated intermediate files (metadata.*.xml). Didn't want to clutter MSBuild. Mostly because of my MSBuild skills and it was contrary to SDK style AKA minimal style csproject concept.

This improves bindings productivity a lot, by not needing to collect nodes from api.xml.

The general idea was:

  1. Add nuget.
  2. Generate files.
  3. Modify results according to needs.
  4. Copy results (files) over if satisfied with result.
  5. Remove nuget (PrivateAssets="all") and
  6. Everything is "old normal"

Similar nuget was for decompiling support (javap, procyon, cfr, smali, krakatau...). Add nuget, build, analyse files, fix metadata, remove nuget.

Focusing on RegEx imposes requirement that the binder is RegEx expert. Reducing number of potential users (binders). Anything complex will result in expectations that we should do it.

The most time consuming steps are

here come list of my planned nice-to-haves...

jonpryor commented 2 years ago

@jpobst wrote:

I'm not a fan of using RegEx characters. I do not think they have any meaning to our average target user.

Fair.

After a little more thought, I think maybe periods are my current preferred option:

Start of namespace:

<AndroidNamespaceReplacement Include='Androidx.' Replacement='Xamarin.AndroidX' />

End of namespace:

<AndroidNamespaceReplacement Include='.Androidx' Replacement='Xamarin.AndroidX' />

I like this idea.

jonpryor commented 2 years ago

@jpobst wrote:

this feature should not be implemented as metadata … However, it could be passed to generator mixed inside a metadata file

This is almost contradictory? What is the meaning of the word "metadata" if it doesn't mean "stuff within Metadata.xml"?

So… back to the beginning. What is the intent? To simplify package renames, in particular so that nested package names are automatically renamed when a parent package name is changed.

What's the benefit of doing this as an <ItemGroup/> vs. within Metadata.xml? Why prefer one over the other?

Benefits to <ItemGroup/> include:

Benefits to Metadata.xml include:

If we thought a "Metadata-less binding project" was possible in the "near-term", I'd be personally more inclined toward an item group. Otherwise, I'm more inclined to stick it into Metadata.xml.

jonpryor commented 2 years ago

@moljac: i'm not sure if you're expressing support for using an MSBuild item group or for using Metadata.xml…. Please clarify?

moljac commented 2 years ago

i'm not sure if you're expressing support for using an MSBuild item group or for using Metadata.xml…. Please clarify?

TL&DR: expressing support for Metadata.xml (being in favor of)

I mentioned MsBuild combined with NuGet as sample od delivering/shipping new features (improvements) and being minimally intrusive. Drawback could be inner-loop performance (nuget downloads, restore, performance hits for MSBuild custom tasks)

jpobst commented 2 years ago

this feature should not be implemented as metadata … However, it could be passed to generator mixed inside a metadata file

This is almost contradictory? What is the meaning of the word "metadata" if it doesn't mean "stuff within Metadata.xml"?

Yeah, that wasn't worded real well, and requires knowledge of how our metadata system works.

I meant it will not use the current system of using XPath to manipulate an XML representation of an api (api.xml).

It will instead be applied via a new, different processor built just for this purpose. However, the user-specified configuration for this process could be given via a metadata.xml file.