GoogleCloudPlatform / dlp-dataflow-deidentification

Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP
Apache License 2.0

Updating the JsonObject method from getAsString() to toString() #184

Closed chitara-01 closed 10 months ago

chitara-01 commented 11 months ago

Summary (Short summary of what is being done) :

Updating the JsonObject method from getAsString() to toString()

Description (Describe in detail the fix made) :

The DEID pipeline failed on JSONL files whose records contain nested structures.

The pipeline calls getAsString() on a JsonObject, which throws an exception for non-primitive data (a nested structure in this case). Replacing getAsString() with toString() fixes the failure. See [this Stack Overflow answer](https://stackoverflow.com/questions/34120882/gson-jsonelement-getasstring-vs-jsonelement-tostring#:~:text=getAsString()%20is%20only%20defined,on%20all%20types%20of%20JsonElement%20.) for the difference between the two methods.
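The difference can be reproduced outside the pipeline. The following is a minimal sketch, not the code from this PR: it assumes Gson 2.8.6 or later on the classpath (for `JsonParser.parseString`), and the class name `GsonStringDemo`, the helper `valueAsString`, and the sample record are hypothetical names chosen for illustration.

```java
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

public class GsonStringDemo {

  // Hypothetical helper (not part of the pipeline): converts any JsonElement
  // to a string without throwing on nested values.
  static String valueAsString(JsonElement element) {
    if (element.isJsonPrimitive()) {
      // For primitives, getAsString() returns the raw value without JSON quotes.
      return element.getAsString();
    }
    // For objects, arrays, and JSON null, toString() serializes to JSON text.
    return element.toString();
  }

  public static void main(String[] args) {
    // Sample record with one flat field and one nested field (illustrative data).
    JsonObject record =
        JsonParser.parseString("{\"name\":\"alice\",\"address\":{\"city\":\"Pune\"}}")
            .getAsJsonObject();

    JsonElement nested = record.get("address");
    try {
      // JsonObject does not override getAsString(), so the base
      // JsonElement implementation throws UnsupportedOperationException.
      nested.getAsString();
    } catch (UnsupportedOperationException e) {
      System.out.println("getAsString() failed on nested object: " + e);
    }

    // toString() is defined for every JsonElement and yields JSON text.
    System.out.println(nested.toString());

    // On a string primitive, toString() keeps the JSON quotes,
    // while getAsString() returns the bare value.
    System.out.println(record.get("name").toString());
    System.out.println(valueAsString(record.get("name")));
  }
}
```

One thing to keep in mind with this kind of change: on string primitives, toString() returns the value wrapped in JSON quotes, so a blanket switch from getAsString() to toString() also changes the output for flat fields. Branching on `isJsonPrimitive()`, as the hypothetical helper does, is one way to preserve the old behavior for primitives.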

Bug ID (if any) :

b/310247478

Public Documentation (if any) :


TESTED (Test Cases with scenario and description - must have 1 positive and 1 negative scenario) :

Tested the DEID pipeline on the JSONL data file attached to the bug.

codecov[bot] commented 11 months ago

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (65fb1e4) 13.41% compared to head (01c6728) 13.41%.

| Files | Patch % | Lines |
|---|---|---|
| ...m/tokenization/json/ConvertJsonRecordToDLPRow.java | 0.00% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@           Coverage Diff            @@
##             master     #184   +/-   ##
=========================================
  Coverage     13.41%   13.41%
  Complexity       67       67
=========================================
  Files            53       53
  Lines          2519     2519
  Branches        213      213
=========================================
  Hits            338      338
  Misses         2161     2161
  Partials         20       20
```
