Open htcfreek opened 1 month ago
Blocked by the rework of the parser (#33165 , #33085).
Possible solution using regex:
@",(?=[^""]*(?:""[^""]*""[^""]*)*$)"
Thanks to @GhostVaibhav.
Hi @htcfreek, if this is a small one, maybe I could help and open a PR quickly.
Hi @htcfreek, if this is a small one, maybe I could help and open a PR quickly.
@GhostVaibhav The problem here is the correct regex. We didn't figure it out yet. But your help is welcome.
@htcfreek, I think your regex is working fine.
@htcfreek, I think your regex is working fine.
@GhostVaibhav What happened if the comma is in the first data string. I imagine that the regex os failing then.
Another thing might be escaping the "
with ""
.
@htcfreek For the first data string, it's passing. But for escaping with double quotes, it's failing.
@htcfreek I think I found the exact implementation you're looking for.
But for escaping with double quotes, it's failing.
The last test line is wrong. It should not end with a "
or you need """
instead of ""
. Because your test line has a closing bot no opening "
in the last data element.
So, if we take that into consideration, it is passing.
Not sure if my regex (found with copilot) is the same as yours but it seems to work:
,(?=(?:[^""]*""[^""]*"")*[^""]*$)
We can add an additional check if the number of "
can be divided by 2. => The no "
is missing.
@htcfreek all the regexes mentioned in this thread are passing all the checks.
@GhostVaibhav Do you like to work on it? Should be easy to implement:
"
sign count.The file to update is <Advanced Paste Module>/Helpers/JsonHelper.cs
@htcfreek I would. But, I would take a little more time than regular devs, since it's my first time working on this module. Hope it's fine.
@GhostVaibhav Regarding the ToDos: We also have to update the split code to use the same regex.
And a small not about testing: You have to build the whole PowerToys solution.
@GhostVaibhav I found a way to remove/replace the quotation marks in csv code to get correct json version.
Example:
csv code: "my ""text"""
json code: my ""text""
json imported: my "text"
This requires the two following action in correct order: First remove single marks, then replace other ones.
1. remove: ^""(?!"")|(?<!"")""$|^""$
- First and last sign if single " AND only two "
2. replace by single ": ^""{3}(?!"")|(?<!"")""{3}$|""{2}
- first three* OR last three* OR pairs of two (* if not more than three)
Do you think this should be an extra issue/pr or does it make sense to implement with escape handling?
@htcfreek, I have no idea about this stuff, as I'm still very new. But, I think this should be a separate issue since it is no where related to this issue.
Description of the new feature / enhancement
The current implementation splits the line on every
,
character. But the csv format says that you can have a cell containing the delimiter if the cell content is enclosed in"
characters. We should not split cell data if they contain the delimiter an being enclosed with the"
character.Scenario when this would be used?
For CSV with that have the delimiter in the data values.
Supporting information
No response