igraph / rigraph

igraph R package
https://r.igraph.org
532 stars 200 forks source link

feat: remove deprecated functions before 1.0 #1352

Closed Antonov548 closed 2 weeks ago

Antonov548 commented 2 months ago

Deprecated

Updated

aviator-app[bot] commented 2 months ago

Current Aviator status

Aviator will automatically update this comment as the status of the PR changes. Comment /aviator refresh to force Aviator to re-examine your PR (or learn about other /aviator commands).

This PR was merged manually (without Aviator). Merging manually can negatively impact the performance of the queue. Consider using Aviator next time.


See the real-time status of this PR on the Aviator webapp.
Use the Aviator Chrome Extension to see the status of your PR within GitHub.
Antonov548 commented 2 months ago

@krlmlr This is related to https://github.com/igraph/igraph/issues/2575#issuecomment-2083059712

Is it right that we should just remove deprecated functions?

szhorvat commented 2 months ago

I edited the top post and added information on what can / should be done.

Antonov548 commented 2 months ago

I edited the top post and added information on what can / should be done.

Thanks!

szhorvat commented 2 months ago

Let me know if you need input here, for example on what parameters to set in the new function to get the behaviour of the old one.


Regarding hub and authority scores, I expect that removing these without deprecation would cause quite the uproar. What could be done is to implement hub_score() and authority_score() in pure R in terms of hits_scores() and keep them deprecated for a while.

There will also be another change in these two functions, as well as eigen_centrality() in 1.0: the scale parameter will be removed, and the function will always behave as if scale=TRUE was set. I'll open a new issue for preparing for this change, but I suggest NOT to expose the scale parameter at all when introducing hits_scores().

Antonov548 commented 2 months ago

@szhorvat

szhorvat commented 2 months ago
  • I think we still need to remove hub_score() and authority_score() because implementing same functions, but using hits_scores() still break interface on R side, so maybe it worth just to remove them.

Why would it break the interface? hits_scores() returns both scores. Just drop one and rewrite the result to the format that hub_score() used to return.

As for the scale parameter, allow the parameter but ignore its value. We never made any promises about the type of scaling used with scale=FALSE.

Antonov548 commented 2 months ago
  • I think we still need to remove hub_score() and authority_score() because implementing same functions, but using hits_scores() still break interface on R side, so maybe it worth just to remove them.

Why would it break the interface? hits_scores() returns both scores. Just drop one and rewrite the result to the format that hub_score() used to return.

As for the scale parameter, allow the parameter but ignore its value. We never made any promises about the type of scaling used with scale=FALSE.

Because hub_score() and authority score() returns not just a vector. Of course we can maybe try to keep previous behavior by creating the similar list. https://github.com/igraph/rigraph/pull/1352/files#diff-aa8edbd6189f5557d42166c470f065672ba52363e61d84dcd7bd3200c3b3420cL3860-L3862

Sorry, I didn't get regarding laplacian. It has also mode=c("out", "in", "all", "total") and igraph_laplacian hasn't this argument.

szhorvat commented 2 months ago
  • Should then functions which are using igraph_get_laplacian to be extended with additional argument mode?

This will require a somewhat breaking change (which is why I kept pushing to do this for 2.0...)

The big change is not mode but the normalization parameter, which is NOT the same as normalized. The old igraph_laplacian() normalized in an inconsistent way. You can see what it did here:

https://github.com/igraph/igraph/blob/223fd0e2c11b4c0a35452ffa68347206771dc0b4/src/properties/spectral.c#L403-L431

We need to transition from the old boolean normalized to the new normalization in some way. I'm not sure what the customary way is in R. I guess we can temporarily have both normalized and normalization at the same time. Deprecate normalized, i.e. any explicit use of normalized should trigger a deprecation warning. I'm not sure what to do when the user passes both normalized and normalization at the same time: I guess throw an error since this makes no sense?

szhorvat commented 2 months ago

Re hub and authority scores:

Currently the function returns a list with many named elements. The main result is in $vector. hits_scores() would return the same, except that instead of vector it'd have both $hub and $authority. hub_score() would call hits_scores(), would remove $authority and would rename $hub to $vector. The rest of the entries are the same between the two functions.

Is there anything I'm missing that may continue to be a problem?

szhorvat commented 1 month ago

I notice you were working on make_lattice. Please see this before going ahead: https://github.com/igraph/rigraph/issues/994

The R interface shouldn't always directly mirror the raw C interface. The periodic parameter should accept either a single logical value or a vector. The circular parameter probably shouldn't be removed immediately, breaking compatibility, but should go through a gradual deprecation.

Antonov548 commented 1 month ago

I notice you were working on make_lattice. Please see this before going ahead: #994

The R interface shouldn't always directly mirror the raw C interface. The periodic parameter should accept either a single logical value or a vector. The circular parameter probably shouldn't be removed immediately, breaking compatibility, but should go through a gradual deprecation.

Sure. I think derpecation of old parameter name is quite easy.

Can you clarify please about single logical value or a vector? You mean it should be handled on R side? Because in core as I understood it could be NULL or vector with exact same length as dimension.

szhorvat commented 1 month ago

Can you clarify please about single logical value or a vector? You mean it should be handled on R side? Because in core as I understood it could be NULL or vector with exact same length as dimension.

Yes, it should be handled in R. I would not allow NULL at all. Instead:

The default should be FALSE.

This is both very convenient (in most cases people will pass a scalar only) and it is compatible with how the old circular parameter behaved.

szhorvat commented 1 month ago

In the top post I linked to relevant issues where I tried to describe what's needed for the non-trivial upgrades. If anything's unclear (as in this case) please ask :-)

Antonov548 commented 1 month ago

@maelle May I ask you to help with documentation in this PR? I trying to update documentation and have error:

✖ In topic 'authority_score.Rd': Skipping; no name and/or title.
✖ In topic 'hub_score.Rd': Skipping; no name and/or title.

I don't get what is missing in documentation, mentioned functions have rdname.

maelle commented 1 month ago

I'll come back to this next week!

maelle commented 1 month ago

Do we really want to remove the functions though? Shouldn't they be deprecated first?

maelle commented 1 month ago

Above authority_score and hub_score you need

#' @title Kleinberg's hub and authority centrality scores.
krlmlr commented 1 month ago

For the test failure, the problem and solution pattern could look like this:

wrapped <- function(a, b, c) {
  list(a, b, c)
}

fun <- function(..., b = NULL, c = NULL) {
  wrapped(..., a = 5, b = b, c = c)
}

fun(b = 2, c = 3)
#> [[1]]
#> [1] 5
#> 
#> [[2]]
#> [1] 2
#> 
#> [[3]]
#> [1] 3
fun(a = 1, b = 2, c = 3)
#> Error in wrapped(..., a = 5, b = b, c = c): formal argument "a" matched by multiple actual arguments

fun <- function(..., a = NULL, b = NULL, c = NULL) {
  a <- if (is.null(a)) 5 else a
  wrapped(..., a = a, b = b, c = c)
}

fun(a = 1, b = 2, c = 3)
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 2
#> 
#> [[3]]
#> [1] 3

Created on 2024-05-30 with reprex v2.1.0

maelle commented 1 month ago

@Antonov548 I'll check out this PR locally and make the changes

Antonov548 commented 1 month ago

@Antonov548 I'll check out this PR locally and make the changes

  • no longer removing the functions
  • doing the argument adaptation mentioned above by Kirill
  • ...

Thanks. I have restored functions. Just pushed changes. Yes, please, take a look.

szhorvat commented 1 month ago

In sample_gnm() / sample_gnp(), the type1 and type variables can now be removed. type1 is not used and the value of type can be inlined, simplifying code.

szhorvat commented 1 month ago

In laplacian_matrix(), the documentation need to be updated based on:

What's most important is to include the formulas for the normalization methods. The text is not sufficient to make it clear what these methods do.

szhorvat commented 1 month ago

Can you rebase this on latest main so it'd be easier to test locally?

maelle commented 1 month ago

I first want to make r cmd check pass, it's not ready yet

szhorvat commented 1 month ago

It seems you copied the C enum constants directly ... these are not meaningful in R. What's needed is to amend the existing docs on normalization methods and explain what each one is, using the same names that the R function takes.

maelle commented 1 month ago

Ok I'll revert and keep this for later #1384

maelle commented 1 month ago

@Antonov548 can you please update the checklist at the very top and indicate which issues will be fixed by this PR?

maelle commented 1 month ago

@Antonov548 I left some questions for you!

Antonov548 commented 1 month ago

@Antonov548 can you please update the checklist at the very top and indicate which issues will be fixed by this PR?

All items in checklist are done. Updated list shows functions which were updated and corresponded list of R side - description what happened to deprecated function.

Antonov548 commented 1 month ago

It's green. Thanks a lot @maelle

maelle commented 1 month ago

Could you please add the "Fix #" comment to link the issues to the PR then @Antonov548? Thanks a ton

szhorvat commented 1 month ago

Ok I'll revert and keep this for later #1384

Do you have a copy of that commit so I could edit it a bit and fix it up>?

krlmlr commented 1 month ago

Running revdepchecks.

For real, now.

szhorvat commented 1 month ago

Once this is done, can you please submit a PR to the C core removing all no-longer-needed deprecated entries from https://github.com/igraph/igraph/blob/master/interfaces/functions.yaml and https://github.com/igraph/igraph/blob/master/interfaces/types.yaml ?

krlmlr commented 1 month ago

Added revdepcheck results. We have new failures, but they might come from the main branch. Double-checking.

szhorvat commented 1 month ago

One of the issues is that in hub_and_authority_scores, the the generated C core, hub_vector is not renamed to hub.vector. In the R code it is. Either this should be fixed in Stimulus, or the code can be fixed manually and autogen disabled for this function.

I would go with a manual patch for the time being, so we can move ahead with the release. Stimulus can be fixed up with some options to control _ vs . translation later. I have some vague memory about this being an issue in the past, but don't remember with which function. I think it was the opposite: we didn't want the .-renaming, but it did it anyway for compatibility. Here we probably do want it for consistency with other functions.

szhorvat commented 1 month ago

The other issue is that the method value renaming in sample_degseq() went a bit wrong and no longer accepts the old names due to the method <- igraph.match.arg(method). Example:

> sample_degseq(c(1,1,2,2,2),method='simple.no.multiple')
Error in `sample_degseq()`:
! `arg` must be one of "configuration", "vl", "fast.heur.simple",
  "configuration.simple", or "edge.switching.simple", not "simple.no.multiple".
Run `rlang::last_trace()` to see where the error occurred.

This is actually on main, not on this branch.

It should instead issue a deprecation warning. Can you have a look at this one @maelle ?

Antonov548 commented 1 month ago

One of the issues is that in hub_and_authority_scores, the the generated C core, hub_vector is not renamed to hub.vector. In the R code it is. Either this should be fixed in Stimulus, or the code can be fixed manually and autogen disabled for this function.

I would go with a manual patch for the time being, so we can move ahead with the release. Stimulus can be fixed up with some options to control _ vs . translation later. I have some vague memory about this being an issue in the past, but don't remember with which function. I think it was the opposite: we didn't want the .-renaming, but it did it anyway for compatibility. Here we probably do want it for consistency with other functions.

Maybe it's still better to fix it in Stimulus? I think it's not much changes in stimulus, but I see that it's this issue no only in hit_scores() function.

maelle commented 1 month ago

https://github.com/igraph/rigraph/pull/1393

krlmlr commented 1 month ago

The revdepcheck failures are a mismatch between hub_score (provided by interface.c) and hub.score (expected by aaa-auto.R) . Can you please resolve?

Moving forward, I'd like to have more tests with the "add.vertex.names" option set to TRUE . I'll see what our tests do if we enable this globally.

maelle commented 4 weeks ago

Relatedly, do we want to make the deprecation of hub.score() and authority.score() a soft deprecation instead?

maelle commented 4 weeks ago

hub.vector not score (note to self)

maelle commented 4 weeks ago

update hits_score() examples (they use the deprecated functions)

szhorvat commented 4 weeks ago

Please make all graphs directed in the examples. This is important. These metrics make no sense for undirected graphs, and the examples confuse users.

Also, stars and rings are both bad examples. I suggest using sample_gnm() with at least twice as many edges as vertices, and make it directed.

szhorvat commented 4 weeks ago

Or better: use macaque from igraphdata.

maelle commented 4 weeks ago

@Antonov548 I do not understand why aaa-auto.R refers to hub.vector. How exactly is the body of hub_and_authority_scores_impl() created? In functions-R.yaml I only see

igraph_hub_and_authority_scores:
    # This is a temporary hack; we need to find a way to handle default values
    # for dependencies. The problem is that the VERTEXINDEX type needs two
    # dependencies: the graph and the vertex set, but we only have the graph
    # in the argument list.
    DEPS: weights ON graph, hub_vector ON graph V(graph), authority_vector ON graph V(graph)

How do we best patch hub_and_authority_scores_impl()? It should refer to names with underscores not with dots. Thank you!

maelle commented 4 weeks ago

The example I tried adding pairs as a reprex of what's broken @Antonov548

library("igraphdata")
library("igraph")
#> 
#> Attaching package: 'igraph'
#> The following objects are masked from 'package:stats':
#> 
#>     decompose, spectrum
#> The following object is masked from 'package:base':
#> 
#>     union
data(macaque)
hits_scores(macaque)
#> This graph was created by an old(er) igraph version.
#>   Call upgrade_graph() on it to use with the current igraph version
#>   For now we convert it on the fly...
#> Error in names(res$hub.vector) <- vertex_attr(graph, "name", V(graph)): attempt to set an attribute on NULL

Created on 2024-06-07 with reprex v2.1.0

Antonov548 commented 4 weeks ago

@Antonov548 I do not understand why aaa-auto.R refers to hub.vector. How exactly is the body of hub_and_authority_scores_impl() created? In functions-R.yaml I only see

igraph_hub_and_authority_scores:
    # This is a temporary hack; we need to find a way to handle default values
    # for dependencies. The problem is that the VERTEXINDEX type needs two
    # dependencies: the graph and the vertex set, but we only have the graph
    # in the argument list.
    DEPS: weights ON graph, hub_vector ON graph V(graph), authority_vector ON graph V(graph)

How do we best patch hub_and_authority_scores_impl()? It should refer to names with underscores not with dots. Thank you!

To not spend time to make changes in stimulus I can propose following change. Please check last commit.

I found a place in stimulus where _ is replaced by dot. Seems this logic not synchronized with C code generation.

Antonov548 commented 3 weeks ago

He... I'm not sure what with GHA here. Maybe something wrong with cache. @krlmlr may you suggest please?