microsoft / Microsoft-Purview-Advanced-Rich-Reports-MPARR-Collector

Repository with all the MPARR solution components
MIT License

DataFormat.Error: We reached the end of the buffer. #45

Open IMmmKI opened 9 months ago

IMmmKI commented 9 months ago

Hi there,

After importing the Power BI report template "MPARR - DLP General Overview Dic2023" and linking it to my workspace, I see an error relating to the DLPAll Data details.

It seems to be related to "DataFormat.Error: We reached the end of the buffer.".

I have attached a screenshot, I can see data in the previous steps and other tables.

Thanks so much for all the effort in this!

Regards, Sheldon

IMmmKI commented 9 months ago

Additional - I have noted that there is an error on the LA workspace (screenshot attached).

ProfKaz commented 9 months ago

Hi Sheldon, for the moment, what you can do to start seeing something is to right-click the query that has the warning symbol and disable the "Enable load" option. That doesn't resolve the issue, but it at least permits a first load of the reports. Some visuals will be shown with errors to fix; that is expected behavior, because the DLP Details query contains the sensitive info types detected. After you do that, we can try to fix the issue, which can have more than one possible trigger. To do these steps you need to go back to the same interface and, on the right side, check all the steps applied to the query, going from top to bottom through the steps listed. Please share the results.
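
For reference, the applied steps behind a query like DLP Details look roughly like the sketch below. The step and column names here are illustrative assumptions rather than the template's exact steps, but they show where the buffer error typically surfaces: the Json.Document call on the truncated PolicyDetails_s text.

```
// Illustrative sketch only: step and column names are assumptions,
// not necessarily the exact steps in the MPARR template.
let
    // "Source" stands in for the Log Analytics query behind the DLP Details table;
    // a local JSON export (hypothetical path) is used so the sketch is self-contained.
    Source = Json.Document(File.Contents("C:\MPARR\sample-dlp-export.json")),
    AsTable = Table.FromRecords(Source),
    // This is the kind of step that raises "DataFormat.Error: We reached the end
    // of the buffer": PolicyDetails_s holds JSON that was truncated on ingestion,
    // so Json.Document runs out of text before the JSON is complete.
    ParsedDetails = Table.AddColumn(AsTable, "PolicyDetails",
        each Json.Document([PolicyDetails_s])),
    Expanded = Table.ExpandRecordColumn(ParsedDetails, "PolicyDetails",
        {"PolicyName", "Rules"})
in
    Expanded
```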

ProfKaz commented 9 months ago

The error on the LA workspace is expected behavior; I'm investigating whether there is a way to resolve it in Log Analytics. It happens because a DLP rule can look for many different kinds of sensitive information types, or match several thousand instances in one item. The service collects all the matches in a single field, and when we request that information its size is too big for Log Analytics. The idea is to see if we can split it in some way or make some changes in Log Analytics; nevertheless, I'm aware of this issue.

IMmmKI commented 9 months ago

Hi Sebastián,

Thank you. That worked, I skipped the error row and I was able to solve it.
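
For reference, "skipping the error row" corresponds to adding a Remove Errors step on the parsed column. A minimal sketch, using illustrative column names rather than the template's exact ones:

```
// Minimal sketch of the "remove errors" workaround; column names are illustrative.
let
    SampleRows = Table.FromRecords({
        [PolicyDetails_s = "{""PolicyName"":""Credit card policy""}"],
        [PolicyDetails_s = "{""PolicyName"":""Truncated polic"]   // cut off, as happens at the ingestion limit
    }),
    Parsed = Table.AddColumn(SampleRows, "PolicyDetails",
        each Json.Document([PolicyDetails_s])),
    // Drops any row whose PolicyDetails cell evaluates to an error,
    // i.e. the events with truncated JSON
    CleanRows = Table.RemoveRowsWithErrors(Parsed, {"PolicyDetails"})
in
    CleanRows
```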

dane6375 commented 8 months ago

We have the same error here. Our DLP policies contain at least 3 or 4 rules, and most of our DLP events then trigger an error in the "PolicyDetails_s" column, which contains a JSON of the matching policy and rules.

If we add a "Remove Errors" step in the query, we can get the report, but of course we then miss a lot of data.

I'm wondering if Log Analytics is the right solution to store these events, as the 32K-character limit is a hard, by-design limit of Log Analytics.
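
An alternative to dropping those rows is to wrap the parse in try/otherwise so the truncated events are kept with a null detail record, and to add a length column to confirm which records hit the character cap. A sketch with illustrative column names:

```
// Sketch of keeping the rows instead of dropping them; column names are illustrative.
let
    SampleRows = Table.FromRecords({
        [PolicyDetails_s = "{""PolicyName"":""Credit card policy""}"],
        [PolicyDetails_s = "{""PolicyName"":""Truncated polic"]
    }),
    // try/otherwise turns a failed Json.Document into null instead of an error,
    // so the event row survives even when its JSON was truncated
    Parsed = Table.AddColumn(SampleRows, "PolicyDetails",
        each try Json.Document([PolicyDetails_s]) otherwise null),
    // A length column helps spot which records were cut at the ingestion limit
    WithLength = Table.AddColumn(Parsed, "DetailsLength",
        each Text.Length([PolicyDetails_s]), Int64.Type)
in
    WithLength
```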

ProfKaz commented 8 months ago

Thanks for reaching out. I'm analyzing the best way to generate this report considering that issue with the buffer, and I'm researching other ways to get the data that avoid that kind of error. Please be patient; this is an open-source solution that I work on in my free time.

dane6375 commented 8 months ago

Hi Sebastian, thanks for your quick response. I understand that's a lot of work and I really appreciate what you've done so far. Do you think that using an Event Hub with MPARR V2 would solve this issue? What solution would you suggest then to capture the Event Hub data? Azure Data Lake?

ProfKaz commented 8 months ago

Yes, sending to Event Hub; you can even filter the data, or apply transformation rules when the data is ingested. Data Lake can be an option too. Normally, for all those kinds of solutions you need to define the structure of the data; with that in mind, all the scripts have the option to export to JSON format, which lets you see the structure of the data. You can execute any of the scripts using the attribute -ExportToJSONFile.