irods / irods_rule_engine_plugin_python

BSD 3-Clause "New" or "Revised" License
10 stars 14 forks source link

[#204] Handle invalid strings by 'blanking' variable out (4-3-stable) #207

Open MartinFlores751 opened 3 months ago

MartinFlores751 commented 3 months ago

The current implementation was to make sure the solution works (which it does). Discussion can be had on if it should be implemented as is, or if it should be moved to the templated function where the error comes from.

MartinFlores751 commented 3 months ago

Template function where the exception was thrown: https://github.com/irods/irods_rule_engine_plugin_python/blob/208ecb2b693ff8e5dca06e6ff54674d3cf0f4d17/include/irods/private/re/python.hpp#L335-L340

Also, after the rule executes, we update the variables again: https://github.com/irods/irods_rule_engine_plugin_python/blob/208ecb2b693ff8e5dca06e6ff54674d3cf0f4d17/src/main.cpp#L574-L581

Is that an issue?

korydraughn commented 3 months ago

The current implementation was to make sure the solution works (which it does).

What does works mean here? Does it mean the agent doesn't crash?

With your solution, what can users/admins expect now? Does anything appear in the log file?

Template function where the exception was thrown:

Is there a benefit to moving the try-catch inside the template function? If we did that, how would the agent react when the UTF-8 case occurs?

Is that an issue?

I'm not sure. Do a little digging to see what update_arguments does and think about how that plays into your solution.

trel commented 3 months ago

blanking does not send any signal to the rule author who may try to access a value that could not be string-ified.

not sure if we've agreed that we want to 'blank' the offending binary data, or set it to some well-defined/documented string value.

possible candidates...

MartinFlores751 commented 3 months ago

The current implementation was to make sure the solution works (which it does).

What does works mean here? Does it mean the agent doesn't crash?

Works means it does not crash, and there are no obvious unintended side effects of the solution. Funky wording because I'm not sure about update_arguments(...), but it doesn't seem to affect anything?

With your solution, what can users/admins expect now?

The affected functions should execute normally, barring the affected variable.

Some consideration may be needed if there were a real UTF-8 issue (if we even support UTF?).

Does anything appear in the log file?

No, we can log if we desire, though.

Template function where the exception was thrown:

Is there a benefit to moving the try-catch inside the template function?

Narrower scope? Less chance of catching other, possibly unrelated issues (e.g. from the <char*> template function).

If we did that, how would the agent react when the UTF-8 case occurs?

It would effectively be handled similarly, just further up the stack.

Is that an issue?

I'm not sure. Do a little digging to see what update_arguments does and think about how that plays into your solution.

Is it common to modify passed in pep arguments, and have them remain modified afterwards for rule engines? If so, then I think it would depend on how the blanked out variable is used (if at all)?

MartinFlores751 commented 3 months ago

blanking does not send any signal to the rule author who may try to access a value that could not be string-ified.

not sure if we've agreed that we want to 'blank' the offending binary data, or set it to some well-defined/documented string value.

possible candidates...

* ''

* 'SYS_NOT_SUPPORTED'

* 'NONSTRING_NOT_AVAILABLE'

* 'NONCONVERTIBLE_BINARY_DATA'

* 'CONVERTED_TO_STRING'

* 'CONVERTED_FROM_NONSTRING'

If we have something similar in the error table, I think that might be best?

A few that might fit:

korydraughn commented 3 months ago

I think the response is supposed to be treated as a raw byte sequence. Sadly, there doesn't appear to be a way to provide this distinction (i.e. we can't change the PEP signature).

Wait, we know the PEP, so it should be possible to take a special code path based on that.

trel commented 3 months ago

right, that could be a way forward. have to hold it very carefully though.