vladmihalcea / hypersistence-utils

The Hypersistence Utils library (previously known as Hibernate Types) gives you Spring and Hibernate utilities that can help you get the most out of your data access layer.
Apache License 2.0
2.39k stars 366 forks source link

Mechanism to return null on deserialization error #561

Closed nightswimmings closed 1 year ago

nightswimmings commented 1 year ago

I miss a way to tell a type to return null on deserialization error

For instance, if you are doing a fooRepository.streamAll(), and Foo is defined as

@Table(name = "foo")
public class Foo {

    @Type(type = JSONB_TYPE)
    @Column(columnDefinition = "jsonb")
    Bar bar;
}

if the database contains a foo table's row with a bar column value that is malformed json, the whole operation fails. I could not find anyway to signal this "skip on error" logic on hyperistence nor jackson, for instance setting bar to null in the object foo and continue extracting other registries from the db

vladmihalcea commented 1 year ago

Because you are using a jsonb column, the JSON object should be validated. I think this problem should happen only for json columns that store the JSON object as String.

All databases that store the JSON as String allow you to validate the object to ensure Consistency.

So, I don't see how this issue can ever happen when you are storing JSON in binary format.

nightswimmings commented 1 year ago

You are right I defined the incorrect type, my underlying database column is indeed json, not jsonb. , so perhaps that is why I am facing the problem.

Is there anyway to tell hypersistence to fail on a record basis?

vladmihalcea commented 1 year ago

Is there anyway to tell hypersistence to fail on a record basis?

Just use PostgreSQL to cast the JSON column to JSONB, and it will fail on the record that's inconsistent.

nightswimmings commented 1 year ago

Hi @vladmihalcea , I accepted the answer because by that time I could not review it and what you said made sense (you already spotted an issue of mine in the previous comment). But I had time to review the issue again after the change and I realised the problem was not on the type. No matter the Postgres DB column type and the @TypeDef class typeClass (json/jsonb), it always fails.

By malformed I did not mean syntactically incorrect json, but semantically one. Indeed I tested it, and Postgres validates json is valid with json column which was always the original.

The issue is about a valid-json schema that does not conform to the resolved dto @Type. For instance, in my original example, imagine Bar contains an UUID field, but in the db comes '{"field":"rrrr"}'. This produces an IllegalArgumentException wrapped by an Hibernate exception, and if you are using a repository method that returns Stream, when I use hypersistence-utils's on-the-fly deserialization, an error disrupts all stream.

Id like to register an ExceptionHandle or something so in case the deserialization fails for any registry, a log.warn is issued and the entry is skipped from the stream. Otherwise, if I were doing an operation over 100000 entries, as soon as one single schema is obsolete, the whole service would break until I fix the record manually

Is this possible?

vladmihalcea commented 1 year ago

Anything is possible, and since you have access to the source code, you can investigate the best way you can achieve your goals.

Your problem is application-specific, and should be solved in your app. The goal of these generic Types is to save the object to the DB column and read it back.

How you validate the content is up to you.

nightswimmings commented 1 year ago

Hi Vlad! Thanks a lot for the answer. I really appreciate your comment, and I will deep in the code if there is no obvious solution although I guess if I opened the ticket is because I did not see an easy one at the very moment.

But honestly, I don't think it is an application-specific issue, because the logic that handles this is hypersistence. It is not related to validation, but the fact that database is schema-less and a @JsonType resolved as a DAO is, by nature.

IMO, there should be a very easy way to hook an exception handling on record deserialization without having to hack the internals of hypersistence with custom deserialization logic. I mean, after all, my situation is a legit need I am pretty sure some people have met.

It's just a way to say "if for some reason, a record cannot be deserialized, do X, specifically in the hypersistence-domain". Indeed, the fact that the Exception is an IllegalArgument wrapped by HibernateException, makes me think that this is all dealt by the low-level

vladmihalcea commented 1 year ago

You can implement such a mechanism and provide a way to integrate it.

Send me a Pull Request with your solution when it's ready to review it.

nightswimmings commented 1 year ago

Ok, I think I managed to work it out, it is not neat, but it saves us from having to touch the code:

In hypersistence-utils.properties hypersistence.utils.jackson.object.mapper=org.personal.utils.JsonTypeSilentFailingObjectMapper

@Slf4j
public class JsonTypeSilentFailingObjectMapper extends ObjectMapper {

    // Taken from original Hypersistence's default  ObjectMapperWrapper.ObjectMapper constant, which unfortunately has a private accessor
    private static final ObjectMapper OBJECT_MAPPER = (new ObjectMapper()).findAndRegisterModules().registerModule((new SimpleModule()).addSerializer(OffsetDateTime .class, ObjectMapperWrapper.OffsetDateTimeSerializer.INSTANCE).addDeserializer(OffsetDateTime.class, ObjectMapperWrapper.OffsetDateTimeDeserializer.INSTANCE));

    public JsonTypeSilentFailingObjectMapper() {
        super(OBJECT_MAPPER);
    }

    @Override
    public <T> T readValue(String content, JavaType valueType) {
        try {
            return super.readValue(content, valueType);
        } catch (IOException e) {
            log.debug("Field {} was impossible to deserialize into known schema, returning null: {}", valueType.getRawClass().getCanonicalName(), e.getMessage());
        }
        return null;
    }
}

I was tempted to do a PR and add a fail-silently setup but I saw the project is composed of modules for several versions, backporting is needed, and testing is not trivial (and I need to configure git with anonymous identity). But it will be just a mather of adding this to ObjectMapperWrapper.class

        private boolean failDeserializationSilently = false;

       public void setFailDeserializationSilently(boolean failDeserializationSilently) {
          this.failDeserializationSilently = failDeserializationSilently;
       }

      //... and something like this to the 4 readValue methods:

      } catch (IOException e) {
            if (failDeserializationSilently) {
                if (LOGGER.isInfoEnabled()) {
                    LOGGER.info("The given string value: '{}' cannot be transformed to {} Json object, returning null...", string, type.getTypeName(), e);
                } else {
                    LOGGER.warn("The given string value cannot be transformed to {} Json object, returning null...", type.getTypeName());
                }
            } else {
                throw new HibernateException(
                        new IllegalArgumentException("The given string value: " + string + " cannot be transformed to Json object", e)
                );
            }
            return null;
        }

and this to Configuration.class

        FAIL_DESERIALIZATION_SILENTLY(
                "hypersistence.utils.json.deserializer.fail-silently",
                null
        );

        String failDeserializationSilently = PropertyKey.FAIL_DESERIALIZATION_SILENTLY.resolve(properties);
        if ("true".equalsIgnoreCase(failDeserializationSilently)) {
            objectMapperWrapper.setFailDeserializationSilently(true);
        }
vladmihalcea commented 1 year ago

I'll take a look at it once you send me a Pull Request for all modules with tests that proves it works.

In the meantime, you can use your solution on your project and see how it works for various scenarios.

nightswimmings commented 1 year ago

Ill try to find the time Vlad!

vladmihalcea commented 1 year ago

Is this urgent for your company?

nightswimmings commented 1 year ago

It's a personal thing, and I already work it out with the solution at the beginning Anyway, looks like once migrated to spring-boot 3 we got hibernate 6 which has this feature built-in, so I guess my cornercase does not deserve that much effort

vladmihalcea commented 1 year ago

It's a personal thing, and I already work it out with the solution at the beginning

In that case, there's no rush. I'll leave this issue Open for several months, and if no one provides a PR for it, I'll close it as there's no point in keeping an issue open if there's no one willing to fix it.

Anyway, looks like once migrated to spring-boot 3 we got hibernate 6 which has this feature built-in, so I guess my cornercase does not deserve that much effort

Hibernate 6 doesn't have this feature built-in. The native JSON support in Hibernate 6 is extremely limited. Not only that, it doesn't allow you the flexibility of choosing String, List, Map, and POJOs on the entity side, but on the DB side, the JSON support is limited to PostgreSQL. The JsonType in this project works on PostgreSQL, Oracle, SQL Server, MySQL, and H2 with no change on the entity side. Just give it a try and see for yourself.

nightswimmings commented 1 year ago

Nice to know, Vlad!