Closed shiralizadeh closed 2 years ago
@firasdib Have you considered using .NET Core's AOT native image generator "CrossGen"? https://github.com/dotnet/coreclr/blob/master/Documentation/building/crossgen.md
https://github.com/dotnet/coreclr/blob/master/Documentation/botr/readytorun-overview.md
@roryap I haven't looked into it.
This project kind of dragged out and never finished, even though it was 99% complete.
@chucker Are you still too busy to help?
Hi folks, I know everyone is very busy and COVID19-wary, but is it possible to work on this in small increments? There was a lot of activity about a year ago but it's gone pretty quiet. I would love to help and am slowly catching up on how this all works.
Last I'm aware of is that it got stuck at compiling the dll's to something decently sized.
I'm sure any help would be appreciated here.
Big +1 to this feature; I use regex101's saved regex feature as a way of commenting my regexes in my code, along with the test cases. Usually, PCRE is sufficient; however, my most recent project had balancing groups in the regexes.
Interestingly, this project is a blazor client-side web-assembly project. All the regex work is being done client side in a dotnet web assembly. These can talk to and from JavaScript via JSInterop. Is this an approach that might be viable for this feature?
What's holding this back is the bundle size. Feature-wise it works. I still haven't had time to figure out how to shrink the lib.
Vänliga hälsningar / Best regards,
Firas Dib
I think the answer lies in CoreRT: ahead of time compilation (not JIT or interpretation of IL) of the exact code necessary to run .Net’s Regex code.
See https://mattwarren.org/2018/06/07/CoreRT-.NET-Runtime-for-AOT/
Last I'm aware of is that it got stuck at compiling the dll's to something decently sized.
Correct. I used Mono-WASM directly in order to save on complexity compared to Blazor (which builds on top of that, and which mostly offers SPA things we don’t need here), but it’s a more manual process. Blazor integrated the Mono linker, which is a dead code elimination tool.
So one approach would be to use Blazor instead. I don’t think the end result will be smaller, but it might be easier to maintain.
Interestingly, this project is a blazor client-side web-assembly project.
This is not Blazor, technically.
I think the answer lies in CoreRT: ahead of time compilation (not JIT or interpretation of IL) of the exact code necessary to run .Net’s Regex code.
CoreRT is basically dead (.NET 5 will ship Mono’s AOT instead).
It’s possible, although still a bit experimental, to use AOT here. However, the main goal to that is performance, not code size.
The next step in this project is to talk to the linker correctly, or to use Blazor (and automatically get the linker).
Bumping this. Anyone who wants to help me get this done 😄?
Sure, I'll help, but still not clear on what approach has been decided on.
@roryap Nothing has been set in stone; whatever allows us to generate the smallest and most performant bundle.
@firasdib how are the php, python and golang regexes done (are they purely javascript driven)? would it be too complicated to do it via web services that run the regexes on the server in .net and return the results?
Is server-side Blazor out of the question due to the hosting model or the potential server load?
@Code-DJ Sorry, that part is fixed. It has to run on the client, no server side work.
Hi @firasdib have you tried bridge.net (https://github.com/bridgedotnet/Bridge) as @AnderssonPeter and @TWiStErRob have suggested. If yes, then what problems did you encounter?
You can paste your performance code on deck.net (https://deck.net) and evaluate it compared to mono/blazor.
I looked at Bridge.NET, but it is not entirely clear to me whether it transpiles the .NET regex engine to JS, or whether it just translates C# instructions to JS and uses the JS regex engine.
This issue is about that first case (running the actual .NET regex engine), not the second (using JS engine).
@Code-DJ any insight on that?
@Doqnach I am not sure, but I tried various namespaces in .NET e.g. System.Diagnostics - Stopwatch and was pleasantly surprised that it worked. It may simply be converting C# to JS equivalent as you suggested but look at the following:
https://github.com/bridgedotnet/Bridge/blob/master/Bridge/Resources/Text/RegularExpressions/RegexParser.js https://github.com/dotnet/runtime/blob/master/src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexParser.cs
Search for scandollar or scanoctal, looks like they have written the javascript equivalent of those methods. Doesn't mean anything. We need examples that return different results on vanilla JS and .NET to see if bridge.net returns results like .NET or like JS to see if it is a viable solution.
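For anyone who wants to run the comparison suggested above, here is a minimal sketch of two probes whose results are known to differ between the .NET engine and a browser's JavaScript engine. The class name `FlavorProbe` is made up for illustration, and whether Bridge.NET reproduces the .NET results has not been checked here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical probe class; only System.Text.RegularExpressions is real API.
public static class FlavorProbe
{
    public static void Main()
    {
        // Probe 1: balancing groups. This syntax is specific to .NET;
        // a JavaScript RegExp rejects the pattern with a SyntaxError.
        var balanced = new Regex(
            @"^(?:[^()]|(?<open>\()|(?<-open>\)))*(?(open)(?!))$");
        Console.WriteLine(balanced.IsMatch("(a(b)c)")); // True
        Console.WriteLine(balanced.IsMatch("(a(b)c"));  // False

        // Probe 2: \d. By default .NET matches any Unicode decimal digit,
        // while JavaScript's \d only matches ASCII 0-9.
        const string arabicIndicThree = "\u0663";
        Console.WriteLine(Regex.IsMatch(arabicIndicThree, @"^\d$")); // True

        // RegexOptions.ECMAScript switches .NET to the JS-like behaviour.
        Console.WriteLine(
            Regex.IsMatch(arabicIndicThree, @"^\d$", RegexOptions.ECMAScript)); // False
    }
}
```

If a transpiled build prints `True`, `False`, `True`, `False`, it is behaving like the .NET engine for these two cases; a syntax error on the first pattern or `False` on the third line would suggest it is falling back to the browser's engine.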
It looks to me like it uses the browser's regex engine for regexes that don't use the features that are special to .NET. So a simple regex will use the JavaScript implementation as a shortcut; everything else seems to be handled by Bridge.NET.
But that's just from a quick look into it, I might need to look into it further.
I agree. All I want is to be able to copy and paste from C# - with C# escape sequences and stuff - but the regex engine can be javascript’s engine.
@Shane32 the code generator should cover your need in that regard.
@Shane32 Not sure if I misunderstood your comment, but this issue is entitled "C# flavor", it's not just about being able to copy/paste C# escaped sequences, it's about C# flavor expression parsing, which means seeing the results that a .NET CLR Regex engine yields. That said if what @Elsensee says is right, then it should work.
Also C# flavor is a fine title for this issue, but probably the end feature should be called .NET flavor as it'll also be applicable to F#/VB.
I've been poking this intermittently but alas to no avail, I just don't know what I am doing. I stumbled upon this link: https://www.mono-project.com/news/2018/01/16/mono-static-webassembly-compilation/ -- does this help anyone?
@chucker your example repo is no longer working for me as I can't download the files necessary as outlined by your readme. Has there been any advancements on the mono-wasm front that would allow for smaller binaries?
@chucker your example repo is no longer working for me as I can't download the files necessary as outlined by your readme.
I imagine that the dead links look like this: https://jenkins.mono-project.com/job/test-mono-mainline-wasm/label=ubuntu-1804-amd64/
If so the cause of this is that the mono project has been migrating away from jenkins to azure CI for some time and they have switched their old jenkins setup to private as part of this move. https://github.com/mono/mono/issues/20841
Unfortunately it seems to be taking a long time for their CI migration project to complete and meanwhile all of the dead jenkins links remain everywhere which is causing much confusion with many people.
The new azure CI setup is here but it seems that they're not yet publishing any built release artefacts: https://dev.azure.com/dnceng/public/_build/results?buildId=1240228&view=results
I imagine that the only way to get built mono wasm binaries at the moment would be to build it yourself.
Has there been any advancements on the mono-wasm front that would allow for smaller binaries?
https://github.com/mono/mono/issues/9857 remains open. I believe that it's currently ~1.8MB for mono wasm vs ~2MB for blazor
This is very decent: https://krausest.github.io/js-framework-benchmark/current.html
@firasdib @bebo-dot-dev the reason is likely not the CI switch but the migration from mono/mono to dotnet/runtime. https://github.com/dotnet/runtime/tree/main/src/mono/wasm seems to be the more current URL for mono-wasm.
I haven't tried to build my PoC with a newer build, though.
I could try building from source and see if I can recompile your PoC that way.
I managed to build it, but I need some hand holding to get any further. Sorry.
Has there been any progress on this?
None, unfortunately. I ran into a road block, so I would need someone to help me setup a new PoC.
@firasdib is there a way for you to share the API? That way we have an idea on what input/output you are expecting?
Just bouncing ideas. To keep the download size small, we can look into https://github.com/SteveSandersonMS/Blazor which is the beginning of Blazor. See the Questions section, Steve mentions a 300KB download vs. 3mb download for Blazor in .NET6.
That repo is missing System.Text.RegularExpressions but has things like System.IO, System.Net.Http etc. that are not needed.
What does PoC stand for?
@TimberStalker PoC stands for "proof of concept".
@Code-DJ It has to allow me to run matches and substitutions, global and non-global. I.e., match(regex, flags, string) and substitute(regex, flags, string, replacement). It can of course be different depending on the language, I'm flexible.
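A rough sketch of what that surface could look like on the C# side, using the `[JSInvokable]` attribute from `Microsoft.JSInterop` inside a Blazor WebAssembly project. The class name `RegexBridge`, the method names, the single-letter flag mapping, and the shape of the returned objects are all assumptions for illustration, not an agreed API.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using Microsoft.JSInterop;

// Hypothetical bridge class; names and flag letters are illustrative only.
public static class RegexBridge
{
    // Maps assumed flag letters to RegexOptions. The 'g' (global) flag is
    // not a RegexOptions value, so the callers below handle it themselves.
    private static RegexOptions ParseFlags(string flags)
    {
        var options = RegexOptions.None;
        foreach (var flag in flags)
        {
            options |= flag switch
            {
                'i' => RegexOptions.IgnoreCase,
                'm' => RegexOptions.Multiline,
                's' => RegexOptions.Singleline,
                'x' => RegexOptions.IgnorePatternWhitespace,
                _ => RegexOptions.None
            };
        }
        return options;
    }

    // match(regex, flags, string): all matches with 'g', otherwise at most one.
    [JSInvokable]
    public static object RunMatch(string regex, string flags, string input)
    {
        var re = new Regex(regex, ParseFlags(flags));
        IEnumerable<Match> matches = flags.Contains('g')
            ? re.Matches(input).Cast<Match>()
            : new[] { re.Match(input) }.Where(m => m.Success);

        return matches.Select(m => new
        {
            m.Index,
            m.Length,
            m.Value,
            Groups = m.Groups.Cast<Group>()
                .Select(g => new { g.Name, g.Success, g.Index, g.Length, g.Value })
                .ToArray()
        }).ToArray();
    }

    // substitute(regex, flags, string, replacement): replaces every match
    // with 'g', otherwise only the first one.
    [JSInvokable]
    public static string RunSubstitute(string regex, string flags, string input, string replacement)
    {
        var re = new Regex(regex, ParseFlags(flags));
        return flags.Contains('g')
            ? re.Replace(input, replacement)
            : re.Replace(input, replacement, 1);
    }
}
```

Keeping the global/non-global decision in the bridge rather than in JavaScript means each call crosses the interop boundary once, which is likely to matter more than the matching itself for short inputs.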
+1 The balancing group is a great feature in the .NET Regex engine. I use it to match HTML elements; it would be very nice to add a C# flavor.
I stumbled upon https://github.com/dotnet/runtime/tree/main/src/mono/wasm -- is this something that can be used? Anyone who has time to experiment with it?
@firasdib I've made a simple Blazor app in which JavaScript calls a C# function.
This is the app repo: https://github.com/AlbertoMonteiro/BlazorAppRegex (I am using dotnet 6.0.101). I've hosted the static site using GitHub Pages; you can check it here: https://albertomonteiro.github.io/BlazorAppRegex/
As you can see, it simply evaluates a regex against a given text and returns true if it matches, false if not.
Calling the c# function from js is really simple https://github.com/AlbertoMonteiro/BlazorAppRegex/blob/454de898ab31533734ceb04a37e5caedc68d852d/index.html#L30-L34
The appName is the assembly name, in that case BlazorApp1.
In C# to be called from js, this is what I had to do: https://github.com/AlbertoMonteiro/BlazorAppRegex/blob/3f1f31d5a83264f01bd5fdb8af83b89fb6a33522/BlazorApp1/Program.cs#L11-L16
I hope this can help
I just improved the repo that I mentioned in the previous comment, you can check the last master version
That's the same regex and value being evaluated with C# (left) and JavaScript on regex101 (right).
@AlbertoMonteiro Thank you! Can you add a readme for how I can compile it myself?
@AlbertoMonteiro I hit this error (in the alert window, after clicking the "Regex match?" button) twice:
Error: No .NET call dispatcher has been set.
First when I initially tried it, and again just now, after reconfiguring and restarting Windows a bunch of times. In both cases, the error disappeared after reloading the page.
@firasdib I've added the README, let me know if this is enough, I can provide more details if you need more help!!
@SunSerega I covered that issue that you faced in the readme of the repo check gh-pages section.
@AlbertoMonteiro Thank you. I tried this on my Linux machine, which has:
dotnet --version
6.0.100
Running dotnet run
I get
Building...
It was not possible to find any compatible framework version
The framework 'Microsoft.AspNetCore.App', version '6.0.1' (x64) was not found.
- No frameworks were found.
You can resolve the problem by installing the specified framework and/or SDK.
The specified framework can be found at:
- https://aka.ms/dotnet-core-applaunch?framework=Microsoft.AspNetCore.App&framework_version=6.0.1&arch=x64&rid=manjaro-x64
Adjusting the csproj-file to reflect the version I have results in the following error:
Building...
/home/firas/projects/BlazorAppRegex/BlazorApp1/BlazorApp1.csproj : error NU1102: Unable to find package Microsoft.AspNetCore.Components.WebAssembly with version (>= 6.0.100)
/home/firas/projects/BlazorAppRegex/BlazorApp1/BlazorApp1.csproj : error NU1102: - Found 40 version(s) in nuget.org [ Nearest version: 6.0.2 ]
/home/firas/projects/BlazorAppRegex/BlazorApp1/BlazorApp1.csproj : error NU1102: Unable to find package Microsoft.AspNetCore.Components.WebAssembly.DevServer with version (>= 6.0.100)
/home/firas/projects/BlazorAppRegex/BlazorApp1/BlazorApp1.csproj : error NU1102: - Found 40 version(s) in nuget.org [ Nearest version: 6.0.2 ]
The build failed. Fix the build errors and run again.
What am I doing wrong :-)?
@firasdib I am going to set up an Ubuntu machine and try this out there.
@firasdib I've just tried it now and it worked fine.
Since I am using ubuntu, I used those instructions: https://docs.microsoft.com/pt-br/dotnet/core/install/linux-ubuntu#2104-
Installing with APT can be done with a few commands. Before you install .NET, run the following commands to add the Microsoft package signing key to your list of trusted keys and add the package repository.
Open a terminal and run the following commands:
wget https://packages.microsoft.com/config/ubuntu/21.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
rm packages-microsoft-prod.deb
The .NET SDK allows you to develop apps with .NET. If you install the .NET SDK, you don't need to install the corresponding runtime. To install the .NET SDK, run the following commands:
sudo apt-get update; \
sudo apt-get install -y apt-transport-https && \
sudo apt-get update && \
sudo apt-get install -y dotnet-sdk-6.0
Which distro are you using?
I use Arch.
Installing via Snap worked, thank you! My friday is gonna be fun!
After building, I can see it outputs ~40 files (3.3 MB gzipped), which includes a bunch of DLL files. Are all of these necessary, or is there a way to reduce the size/number of files? When I load your example website, I don't see that many files being fetched over the network, for example.
Sorry for all the questions!
Edit: Looks like these files are hard cached, checking it out in incognito will show the files being downloaded. The question then is, can some of these be omitted?
@firasdib Yeah, I am going to look into that. I know that it is possible to work with some trimming strategies, but I have to research a little bit because I am no specialist in Blazor. Someone said that it may be possible to reduce the complete size to 1 MB.
@AlbertoMonteiro I'll also investigate a bit on my end. 40 files, even if they are small, can cause some serious latencies. Luckily they are only fetched once, but still.
Keep me posted on what you find :-)
If you want to research about that too, this would be a starting point: https://docs.microsoft.com/en-us/aspnet/core/blazor/host-and-deploy/configure-trimmer?view=aspnetcore-6.0
@AlbertoMonteiro It looks like you can enable AOT compilation, which should reduce the number of files and probably improve performance.
@firasdib Yeah, Steve Sanderson talks about it in this video: New Blazor WebAssembly capabilities in .NET 6
This link goes directly to the time when he starts to talk about the AOT: https://youtu.be/kesUNeBZ1Os?t=1357
@firasdib I've changed some things and, without AOT, was able to reduce the total download size to 1.2 MB gzipped.
I've changed the csproj file to the following:
<Project Sdk="Microsoft.NET.Sdk.BlazorWebAssembly">
  <PropertyGroup>
    <TargetFramework>net6.0</TargetFramework>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
    <!-- Remove some unused features. Shrinks the published app by ~700KB. -->
    <InvariantGlobalization>true</InvariantGlobalization>
    <BlazorEnableTimeZoneSupport>false</BlazorEnableTimeZoneSupport>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.AspNetCore.Components.WebAssembly" Version="6.0.1" />
  </ItemGroup>
</Project>
Program.cs
using Microsoft.AspNetCore.Components.WebAssembly.Hosting;
using Microsoft.JSInterop;
using System.Text.RegularExpressions;

// Create the default host builder; the result is discarded.
_ = WebAssemblyHostBuilder.CreateDefault(args);

public static class Sample
{
    // Callable from JavaScript through JS interop.
    [JSInvokable]
    public static object SayHelloCS(string regex, string value)
    {
        var result = Regex.Match(value, regex);
        return new
        {
            result.Success,
            Captures = result.Captures.Cast<Capture>().Select(x => new { x.Index, x.Length, x.Value }),
            Groups = result.Groups.Cast<Group>().Select(x => new { x.Index, x.Length, x.Success, x.Name, x.Value })
        };
    }
}
@firasdib Another improvement: I reduced the total size to 896 KB (gzipped), again without AOT.
I disabled implicit usings in the csproj:
<ImplicitUsings>disable</ImplicitUsings>
and had to add a new using in Program.cs:
using Microsoft.AspNetCore.Components.WebAssembly.Hosting;
using Microsoft.JSInterop;
using System.Linq; // I had to add this line
using System.Text.RegularExpressions;

// Create the default host builder; the result is discarded.
_ = WebAssemblyHostBuilder.CreateDefault(args);

public static class Sample
{
    // Callable from JavaScript through JS interop.
    [JSInvokable]
    public static object SayHelloCS(string regex, string value)
    {
        var result = Regex.Match(value, regex);
        return new
        {
            result.Success,
            Captures = result.Captures.Cast<Capture>().Select(x => new { x.Index, x.Length, x.Value }),
            Groups = result.Groups.Cast<Group>().Select(x => new { x.Index, x.Length, x.Success, x.Name, x.Value })
        };
    }
}
Hi regex101, can you add C# to your Flavor section? If you want, I can help you with this.
Thanks, Shiralizadeh