Open tglink72 opened 3 weeks ago
Hi @tglink72, so currently your scenario is that when you export a text field with a '\' in it, the '\' has already been removed in the .csv file?
Bert,
Thanks for the reply. The scenario currently is that when we export a text field with a backslash \ in it, the \ is removed. An example is in the screenshot below. The regex column has this value in a basic Synapse query. It looks like the backslashes are being removed in the initial export to the delta folder, as I do not see the \ in the Regex field in the raw csv in the delta folder.
Regex-3
^Add[A-Z]-d{5}$
But if we view the data in BC below, you will see it contains a \ that is not included in the query result.
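One way to narrow down where the backslash gets lost is to check the raw file text before any CSV parser touches it, since a reader configured with `\` as its escape character will silently drop it. Below is a minimal Python sketch of that check. The sample row, the column layout, and the `\d` in the pattern are assumptions for illustration (a guess at what the BC value might look like), not values taken from the actual export.

```python
import csv
import io

# Hypothetical raw line from a deltas CSV file (assumed layout and value).
# As text this is: 1,"^Add[A-Z]-\d{5}$"
raw_line = '1,"^Add[A-Z]-\\d{5}$"\n'


def contains_backslash(raw_text: str) -> bool:
    """Inspect the unparsed text, so no CSV reader can hide the character."""
    return "\\" in raw_text


def parse(raw_text: str, escapechar=None):
    """Parse one CSV line with Python's csv module and return its fields."""
    reader = csv.reader(io.StringIO(raw_text), escapechar=escapechar)
    return next(reader)


# Default settings: no escape character, the backslash is kept.
plain = parse(raw_line)

# Backslash configured as escape character: it is consumed by the reader.
escaped = parse(raw_line, escapechar="\\")

print(contains_backslash(raw_line))
print(plain[1])
print(escaped[1])
```

If the raw check already reports no backslash, the character was lost before the file was written; if it is present in the raw text but missing after parsing, the reader settings are the more likely cause.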
From: Bert Verbeek @.> Sent: Friday, September 6, 2024 2:39 AM To: Bertverbeek4PS/bc2adls @.> Cc: Tom Link @.>; Mention @.> Subject: Re: [Bertverbeek4PS/bc2adls] Characters Removed from Extracted Data - How does extract handle invalid JSON characters (Issue #169)
Thanks @tglink72, I will then see if I can reproduce it. Hopefully I will have time for it this week.
@tglink72 I did a test with the export to MS Fabric Lakehouse.
Customer comments:
Then the export to the csv delta file:
When the notebook is run:
So with Fabric it goes well. I will also look into the export to Azure Data Lake.
@tglink72 I have also tested it with the Synapse pipelines and Azure File Storage, but I cannot reproduce it.
When exporting the deltas:
When the Synapse pipeline runs and in Power BI:
I'm using parquet files as destination.
Bert,
Thanks for the replies. I am also using the Synapse pipeline and Azure File Storage and, unfortunately, I can reproduce it each time. The backslash is removed in the extension's extract to the deltas folder. If you would like, we could do a screen share.
Thanks
Tom Link
OK, strange @tglink72. Which version do you have? And yes, it is OK to have a meeting. On Friday I have the whole afternoon available.
Hello,
Thanks
Tom