image-rs / imageproc

Image processing operations
MIT License
736 stars 144 forks source link

initialization of `morphology::Mask` #604

Closed Morgane55440 closed 3 months ago

Morgane55440 commented 3 months ago

in the PR #598 , the subject of the initialization of the struct morphology::Mask was raised, in particular as far as data validation is concerned, and the PR was merged before this could be resolved.

i am therefore making this issue in order to resolve the matter. (@cospectrum pinging you because you raised the issue first)

currently each constructor calls Self { elements }, and data validation comes solely from the algorithms. for example the promises of "reverse lexicographic order" and "no duplicate" is obviously fullfilled by the fact that all the constructors all use some form double loops over strictly increasing ranges, with the y value as the outer loop.

of course, modifications of the algorithms could cause these promises to be broken, therefore this is not ideal.

one thing idea that i proposed would be to add warnings about the potential safety concern through a well-documented unsafe constructors :

/// new_unchecked creates a new mask without doing any form of data validation
    /// 
    /// # Safety
    /// 
    /// this method is not user-facing and marked `unsafe` because it may lead 
    /// to invalid state if called with the wrong arguments.
    /// 
    /// By calling new_unchecked, you guarantee that :
    /// - all the integer values in `elements` will be strictly between -512 and 512
    /// - the maximum L_inf distance between any 2 points in `elements` is 512
    /// - all Points in `elements` are sorted in reverse lexicographic order, line by line ((-1,-1),(0,-1),(1,-1),(-1,0),(0,0),...)
    /// - no point appears twice in `elements`
    unsafe fn new_unchecked(elements: Vec<Point<i16>>) -> Self {
        Self { elements }
    }

this of course has the major disadvantage that it does not do any validation in itself.

therefore we should consider methods of actual validation.

this doesn't need to be in the release as it doesn't impact users, but it shouldn't take long either way

cospectrum commented 3 months ago

Already done https://github.com/image-rs/imageproc/pull/603

Safety statement should be used for unsafe, not for common invariants.

Morgane55440 commented 3 months ago

if that's your final decision, then we can close. i don't really care either way as long as the module is updated and everything works.

then should i close this issue ? i guess i mark it as duplicate ?

The following is simply an explanation of the train of thought that lead me to mark new_unchecked as unsafe, you should feel free to ignore it, just stating my biased opinion based on my understanding of the rust book.

explanation > `Safety` statement should be used for `unsafe`, not for common invariants. if `new_unchecked` is `unsafe`, then it needs a `Safety` statement. it is my understanding that invalid state is `unsafe` because it can lead to undefined behavior. if a vec had a length variable different than it's actual size in memory, that could break any program. i carefully read the definition the definition of `unsafe` and `Undefined Behavior` function from the rust book before coming to the conclusion that `new_unchecked` is in fact `unsafe`. [Unsafe Function or Method](https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#calling-an-unsafe-function-or-method) > The unsafe keyword in this context indicates the function has requirements we need to uphold when we call this function, because Rust can’t guarantee we’ve met these requirements. By calling an unsafe function within an unsafe block, we’re saying that we’ve read this function’s documentation and take responsibility for upholding the function’s contracts. [Behavior considered undefined](https://doc.rust-lang.org/reference/behavior-considered-undefined.html#behavior-considered-undefined) > Rust code is incorrect if it exhibits any of the behaviors in the following list. This includes code within unsafe blocks and unsafe functions. unsafe only means that avoiding undefined behavior is on the programmer; it does not change anything about the fact that Rust programs must never cause undefined behavior. It is the programmer's responsibility when writing unsafe code to ensure that any safe code interacting with the unsafe code cannot trigger these behaviors. unsafe code that satisfies this property for any safe client is called sound; if unsafe code can be misused by safe code to exhibit undefined behavior, it is unsound. ... > Producing an invalid value, even in private fields and locals. "Producing" a value happens any time a value is assigned to or read from a place, passed to a function/primitive operation or returned from a function/primitive operation. The following values are invalid (at their respective type): ... > - Invalid values for a type with a custom definition of invalid values. In the standard library, this affects [NonNull](https://doc.rust-lang.org/core/ptr/struct.NonNull.html) and [NonZero*](https://doc.rust-lang.org/core/num/index.html). i can completely see how this might be seen an overexertion of the intent of `unsafe`, but it is how i interpreted it.
cospectrum commented 3 months ago

We can close after merge of proptests

Morgane55440 commented 3 months ago

ok !

ripytide commented 3 months ago

I'd consider the ordering an internal implementation detail, the unit tests should catch it if the invariants are invalidated, that said debug_assert!() don't cause any harm so I guess there's no reason not to add them.

cospectrum commented 3 months ago

To close