KristofferStrube / Blazor.SVGEditor

A basic SVG editor written in Blazor.
https://kristofferstrube.github.io/Blazor.SVGEditor/
MIT License

Separating decimal numbers that have no spaces but are separated by dots. #1

Closed KristofferStrube closed 2 years ago

KristofferStrube commented 3 years ago

Sometimes we get input like this for some instructions in path data:

a.457.318.914.317.61.78

Which should be made into:

a 0.457 0.318 0.914 0.317 0.61 0.78

The idea would be to find every . char that is preceded either by an alphabetic char or by digits that run back to another . without any spaces in between.

I tried doing something like this.

string pattern = @"\.(\d+)\.";
string replacement = "$0 0.";
string input = "a.457.318.914.317.61.78";
string result = Regex.Replace(input, pattern, replacement);

but that result is:

a.457. 0.318.914. 0.317.61. 0.78

So I need to optionally match [a-zA-Z] before the dot, and I somehow need each dot to count both as the end of one number and as the start of the next, without leaving extra dots behind.
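One way this idea could be expressed is with a lookbehind, which .NET allows to be of variable length. The sketch below is only an illustration, not code from the project: the class name FusedDecimalRegexSketch and the method Normalize are hypothetical. It consumes only the dot itself, so one dot can end a number and start the next without the matches overlapping.

```csharp
using System.Text.RegularExpressions;

public static class FusedDecimalRegexSketch
{
    // Hypothetical helper, not part of Blazor.SVGEditor.
    // Matches a '.' that directly follows a letter, or that directly follows
    // digits which themselves follow another '.' (a second dot in the same run).
    private static readonly Regex FusedDot =
        new Regex(@"(?<=[a-zA-Z]|\.[0-9]+)\.");

    public static string Normalize(string input)
    {
        // Only the dot is consumed; the lookbehind inspects the original text,
        // so neighbouring matches never compete for the same characters.
        return FusedDot.Replace(input, " 0.");
    }
}
```

With this sketch, a.457.318.914.317.61.78 becomes a 0.457 0.318 0.914 0.317 0.61 0.78, and input without fused dots such as M 10 10 A 1 1 0 1 1 20 30 is left unchanged. A dot that follows a minus sign, as in l-.004.007, is kept as -.004, so the consumer would still need to accept numbers without a leading zero.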

One could also take a look at the grammar for path data: https://www.w3.org/TR/SVG/paths.html#PathDataBNF where they have written out a full grammar in BNF that could potentially capture all of this without any edge cases like this one.

But this second approach means redoing the full parsing, which could make the code tougher to read.
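To give a rough feel for what the grammar-based approach might look like, here is a minimal tokenizer sketch. It is an assumption about how one could do it, not the project's parser: the names PathTokenizerSketch and Tokenize are hypothetical, and the regex only approximates the number production of the grammar. It does not cover everything the grammar allows (for instance, arc flags written without separators).

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PathTokenizerSketch
{
    // Hypothetical helper, not part of Blazor.SVGEditor.
    // A token is either a number (optional sign, digits with an optional
    // fraction or a bare fraction, optional exponent) or one command letter.
    // The number alternative comes first so that the 'e' of an exponent
    // is not read as a letter.
    private static readonly Regex Token = new Regex(
        @"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?|[a-zA-Z]");

    public static List<string> Tokenize(string pathData)
    {
        var tokens = new List<string>();
        foreach (Match m in Token.Matches(pathData))
        {
            // Separators (spaces, commas) match neither alternative and are skipped.
            tokens.Add(m.Value);
        }
        return tokens;
    }
}
```

Because a number can contain at most one dot in this pattern, a.457.318 splits into a, .457 and .318 on its own, and c0-.57.464 splits into c, 0, -.57 and .464, with no separate clean-up pass.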

pcarret commented 3 years ago

I googled these keywords: regex to replace decimal point "without zero"

Went to this

Then played with the Stack Overflow answer from Andreas at https://regex101.com/r/7ymqcn/1

Then finally https://dotnetfiddle.net/

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string pattern = @"(?:0|.[1-9][0-9]*)(:\.[0-9]{1,2})?";
        string replacement = " 0$0";
        string input = "a.457.318.914.317.61.78";
        string result = Regex.Replace(input, pattern, replacement);
        Console.WriteLine(result);
    }
}

Output is a 0.457 0.318 0.914 0.317 0.61 0.78

KristofferStrube commented 2 years ago

I will add this once I have tested its validity and created a test case that covers it.

KristofferStrube commented 2 years ago

This has now been added and tested.

KristofferStrube commented 2 years ago

I mistakenly only checked that my new test case parsed, but this fix actually broke a lot of existing tests. An example is:

string pattern = @"(?:0|.[1-9][0-9]*)(:\.[0-9]{1,2})?";
string replacement = " 0$0";
string input = "M 10 10 A 1 1 0 1 1 20 30";
string result = Regex.Replace(input, pattern, replacement);
Console.WriteLine(result);

pcarret commented 2 years ago

Hi Kristoffer,

The complexity of this case comes from the numbers not being separated at all, not even by a single space, in:

"a.457.318.914.317.61.78";

Maybe you need two passes: one that replaces 0.98 with .98, then a second pass that does the opposite.

Could you try running your test series with this expression:

(?:[0]?.[1-9][0-9]*)(:.[0-9]{1,2})?

Here are my tests, which work in the 3 cases but with two passes:

M 10 10 A 1 1 0 1 1 20 30
a 0.457 0.318 0.914 0.317 0.61 0.78
a .457 .318 .914 .317 .61 .78
a.457.318.914.317.61.78

ercgeek commented 2 years ago

First of all, thanks for sharing this project. It is very interesting for me as a way to learn SVG handling in Blazor.

I'm not a RegEx expert so there may be an easier way to do this. But the following worked for me.

In PathData.cs I added a new method:

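    // Correct strings like "l-.004.007" and "c0-.57.464"
    // by inserting " 0" in front of every '.' after the first one in a token.
    // Note: Count(...) needs "using System.Linq;" at the top of the file.
    private static string ValidateString(string seq)
    {
        var tokens = seq.Split(' ');
        for (int i = 0; i < tokens.Length; i++)
        {
            int numberOfPeriods = tokens[i].Count(f => (f == '.'));
            if (numberOfPeriods > 1)
            {
                // Start searching just past the first period.
                int startIndex = tokens[i].IndexOf('.') + 1;
                for (int j = 1; j < numberOfPeriods; j++)
                {
                    int index = tokens[i].IndexOf('.', startIndex);
                    tokens[i] = tokens[i].Insert(index, " 0");
                    // Skip the inserted " 0" and the period that followed it.
                    startIndex = index + 3;
                }
            }
        }
        return string.Join(" ", tokens);
    }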

Then changed line 21 to: var seq = ValidateString(splitInstructionSequences[curr].TrimEnd(' '));

I successfully tried this fix with a bunch of SVGs from https://simpleicons.org/ that were failing before.

Enrique

KristofferStrube commented 2 years ago

Hey Enrique (@ercgeek),

Glad to hear that you enjoy the project.

Thanks for the help. I have added your function and tested that it works.

Your solution is nice and easy to understand, but I have left a comment above it noting that we should maybe look at more performant methods in the future.
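As one possible direction for that, here is a single-pass sketch that walks the string once with a StringBuilder instead of splitting it and calling Insert repeatedly. It is only an illustration under assumptions: the names SinglePassSketch and SplitFusedDecimals are hypothetical, and it has not been run against the project's tests. Unlike the function above, it treats any non-digit (not just a space) as the end of a number, so its output may differ on some inputs.

```csharp
using System.Text;

public static class SinglePassSketch
{
    // Hypothetical helper, not part of Blazor.SVGEditor.
    // Inserts " 0" before a '.' when the number being read already has one.
    public static string SplitFusedDecimals(string seq)
    {
        var sb = new StringBuilder(seq.Length + 16);
        bool seenDot = false; // true while inside a number that already has a '.'
        foreach (char c in seq)
        {
            if (c == '.')
            {
                if (seenDot)
                {
                    // A second dot means a new number starts here.
                    sb.Append(" 0");
                }
                seenDot = true;
            }
            else if (c < '0' || c > '9')
            {
                // Any non-digit (letter, sign, space, comma) ends the current number.
                seenDot = false;
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

For the two strings from the comment on ValidateString, this gives l-.004 0.007 and c0-.57 0.464. An input such as c1.5-2.5, where a minus sign separates two ordinary decimals, is left as it is.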

Again thanks.