igraph / rigraph

igraph R package
https://r.igraph.org
555 stars 202 forks source link

Bad conversion of NULL when passing integer vector #1564

Open szhorvat opened 1 month ago

szhorvat commented 1 month ago

When passing an integer vector from R to C, the NULL value is converted to an empty vector.

This is the ultimate cause of the UBSan failure that CRAN caught,

> sample_motifs(g, 3)
vendor/cigraph/src/misc/motifs.c:719:12: runtime error: -nan is outside the range of representable values of type 'long'
    #0 0x7fdade541593 in igraph_motifs_randesu_estimate /data/gannet/ripley/R/packages/tests-clang-UBSAN/igraph/src/vendor/cigraph/src/misc/motifs.c:719:12
    #1 0x7fdadde7184f in R_igraph_motifs_randesu_estimate /data/gannet/ripley/R/packages/tests-clang-UBSAN/igraph/src/rinterface.c:8689:3
    #2 0x55ea33fe822a in R_doDotCall (/data/gannet/ripley/R/R-clang/bin/exec/R+0xf122a)
    #3 0x55ea34030024 in bcEval_loop eval.c
    #4 0x55ea3401f39b in bcEval eval.c
    #5 0x55ea3401eb24 in Rf_eval (/data/gannet/ripley/R/R-clang/bin/exec/R+0x127b24)
    #6 0x55ea34042141 in R_execClosure eval.c
    #7 0x55ea3404162b in applyClosure_core eval.c
    #8 0x55ea3401ef75 in Rf_eval (/data/gannet/ripley/R/R-clang/bin/exec/R+0x127f75)
    #9 0x55ea34076b00 in Rf_ReplIteration (/data/gannet/ripley/R/R-clang/bin/exec/R+0x17fb00)
    #10 0x55ea3407849e in run_Rmainloop (/data/gannet/ripley/R/R-clang/bin/exec/R+0x18149e)
    #11 0x55ea3407850a in Rf_mainloop (/data/gannet/ripley/R/R-clang/bin/exec/R+0x18150a)
    #12 0x55ea33f5e947 in main (/data/gannet/ripley/R/R-clang/bin/exec/R+0x67947)
    #13 0x7fdaeee2950f in __libc_start_call_main (/lib64/libc.so.6+0x2950f) (BuildId: 8257ee907646e9b057197533d1e4ac8ede7a9c5c)
    #14 0x7fdaeee295c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x295c8) (BuildId: 8257ee907646e9b057197533d1e4ac8ede7a9c5c)
    #15 0x55ea33f5e864 in _start (/data/gannet/ripley/R/R-clang/bin/exec/R+0x67864)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior vendor/cigraph/src/misc/motifs.c:719:12 
[1] -9.223372e+18

It's not just an UBSan warning. sample_motifs() is essentially broken, and likely so are some other functions that take optional integer vectors.

In rinterface.c, I see this:

  if (!Rf_isNull(sample)) {
    R_SEXP_to_vector_int_copy(sample, &c_sample);
    IGRAPH_FINALLY(igraph_vector_int_destroy, &c_sample);
  } else {
    IGRAPH_R_CHECK(igraph_vector_int_init(&c_sample, 0));
    IGRAPH_FINALLY(igraph_vector_int_destroy, &c_sample);
  }

It should produce a NULL pointer instead of an empty vector.

This should be changed, but be aware that doing so may cause issues elsewhere, which will need to be fixed as well.

I don't see the null check in types-RC.yaml, so I assume it comes directly from Stimulus. Can you have a look, @Antonov548 ?

szhorvat commented 1 month ago

Generally, for any OPTIONAL vector, NULL in R should be converted to a NULL pointer in C. If this is not appropriate for some function, we should either make changes in the C core (ping me) or special-case it in the R interface.

krlmlr commented 1 month ago

Thanks for the analysis. I'm not sure I follow.

https://github.com/igraph/rigraph/blob/a8717897fa5ad83cda8084b676ff409f65602e56/src/rinterface.c#L8681-L8689

In this code, line 8689 (the one shown in CRAN's backtrace) checks sample for being an R NULL, and passes 0 (a C NULL) in this case. Initialization of c_sample may be redundant, but perhaps harmless?

Can this be replicated with Docker and https://github.com/cynkra/docker-images/pkgs/container/docker-images%2Frigraph-san ?

krlmlr commented 1 month ago

Fortunately, the fix is easy.

szhorvat commented 1 month ago

Fortunately, the fix is easy.

OK, I see. I didn't read the code carefully, just observed the behaviour.

krlmlr commented 4 weeks ago

The CI shows another unrelated error, hopefully similarly simple to fix.

https://github.com/igraph/rigraph/actions/runs/11601556838/job/32304898594#step:5:6515

szhorvat commented 4 weeks ago

So what happened was that as.numeric(NULL) gives a zero-length vector, right? Which is precisely what we don't want here. I investigated this from the C core side, improved the error messages there, and when I saw zero-length vectors arriving, I assumed that R/igraph was intentionally passing those. Some C functions do interpret zero-length vectors as "nothing" but we're getting rid of that behaviour.

There are a lot of as.numeric() in the R glue code ... a lot of the in-conversion in type-RR.yaml just does as.numeric(). Can we / should we replace this with a function that keeps NULL as NULL?

In C, generally, the convention is that a vector parameter can be omitted by passing NULL for it. Not all parameters can be omitted. interfaces.yaml should indicate which one can using OPTIONAL. For some functions, it still doesn't use OPTIONAL, but specifies NULL as a default value.

When a parameter cannot be omitted, but the user still passes NULL in C, the result is a crash. So blindly passing NULL from R even for non-OPTIONAL parameters is not okay.

This should be reviewed carefully, the glue code should be adapted. What I can do on the C side is to make sure that OPTIONAL is always used when appropriate.

szhorvat commented 4 weeks ago

@ntamas What is the proper way to specify an optional vector parameter in functions.yaml? Should it be OPTIONAL SOMETHING param=NULL or is OPTIONAL SOMETHING param enough? It would make sense to me that =NULL shouldn't be necessary when OPTIONAL is already given.

szhorvat commented 4 weeks ago

See https://github.com/igraph/igraph/pull/2690 and https://github.com/igraph/rigraph/pull/1567