Closed pat-s closed 4 years ago
@krzyslom can you look at this? We can start by porting this as is, and then gradually replacing parts with the C++ code.
@zzawadz I'll investigate it next weekend.
any update here? :)
Hey @pat-s
I took a look at the function a while ago. Here are my thoughts.
if (sample.size < 1) {
warning(paste("Assumed: sample.size = ", sample.size))
sample.size = 1
sample_instances_idx = sample(1:instances_count, 1)
}
The function prints passed value while calling it assumed. Next the value is silently changed. This behavior occurs in several steps. Moreover we can pass double
or a negative number to sample.size
which does not make sense. Perhaps it is better to throw an error is such cases.
There are several not exported functions that also have to be implemented in the package e.g. get.data.frame.from.formula
and normalize.min.max
.
relief
function is composed from several inner functions. Results of their call are assigned to global variables with <<-
operator. The question is should we just try to make a copy-paste for this functionality, or should we try to redesign the function with future C++ interface in mind. Unfortunately I am not familiar with C++ so I do not have a good judgement in that case. I'm not sure if @zzawadz would write Rcpp implementation in similar fashion.
@zzawadz please take a look at the topic.
Sorry for the late follow-up.
I have no detailed knowledge about the algorithm but we would like to use a Rcpp version of it in both {mlr} and {mlr3filters}.
I just searched again and there is currently no other package implementing the RELIEF algorithm in R.
Compared to other filters, the java-based {FSelector} implementation takes really long. It would be a good enhancement if we could port this to {FSelectorRcpp}. This filter method is still able to achieve good results in benchmarks and I really think the effort would be worth it.
Sometimes a rewrite is even easier than porting existing code - but you are the experts here, I have no clue about the internals and know only very few about Rcpp.
@zzawadz Did you have time to look at all of this in the meantime?
CC @zzawadz
bump :)
I promise I'll loot at it. I think that I'll start from copying the whole function from FSelector and start replacing its parts with C++ code.
My plan is like this:
1) contact author of this function and get approval for adapting his code? (do i need this? it's open source so maybe just mentioning the author will be sufficient - I have spent too much time in BigCo to be sure).
2) Adapt the code to use more FSelectorRcpp's like interfaces.
3) Remove <<-
assignment (It's hard for me to reason about this code where it's used).
4) Rewrite hottest parts in cpp?
@pat-s what do you think?
Sure, whatever works :) You are th C++ expert here 👍
I don't think you need to contact the author. Just provide a comment its taken crom FSelecote for now. On the other hand the author of FSelector vanished. The package just have a maintainer that looks after abandoned package and that's not the original author
On Sat, 18 Jul 2020, 15:55 Patrick Schratz, notifications@github.com wrote:
Sure, whatever works :) You are th C++ expert here 👍
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/mi2-warsaw/FSelectorRcpp/issues/78#issuecomment-660479257, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTVVTVXDAAH2ON5ZZDEIM3R4GLVJANCNFSM4H2WISUA .
@MarcinKosinski @krzyslom can you look at the relief branch? There's the current version. I think the api will stay as is (same approach as we're using in information_gain
), so we can release it and then work on the speeding up the internals of relief
.
Any more tests will be welcomed ;)
I might have some time this weekend!
wt., 21 lip 2020 o 08:53 Zygmunt Zawadzki notifications@github.com napisał(a):
@MarcinKosinski https://github.com/MarcinKosinski @krzyslom https://github.com/krzyslom can you look at the relief branch? There's the current version. I think the api will stay as is (same approach as we're using in information_gain), so we can release it and then work on the speeding up the internals of relief.
Any more tests will be welcomed ;)
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mi2-warsaw/FSelectorRcpp/issues/78#issuecomment-661671534, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTVVTXSBOH4TFX47UA6E2TR4U3PLANCNFSM4H2WISUA .
Ok then. I'll schedule the release to Monday. @MarcinKosinski any uncaught errors will be your fault xD
Having trouble finding some spot to have a review. Maybe I'll look at that in the evening or tmrw early in the morning.
śr., 29 lip 2020 o 22:01 Zygmunt Zawadzki notifications@github.com napisał(a):
Ok then. I'll schedule the release to Monday. @MarcinKosinski https://github.com/MarcinKosinski any uncaught errors will be your fault xD
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mi2-warsaw/FSelectorRcpp/issues/78#issuecomment-665883568, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTVVTUCB5K4QNJ6QAPBRNLR6B5YPANCNFSM4H2WISUA .
Looking now!
niedz., 2 sie 2020 o 12:04 Marcin Kosiński m.p.kosinski@gmail.com napisał(a):
Having trouble finding some spot to have a review. Maybe I'll look at that in the evening or tmrw early in the morning.
śr., 29 lip 2020 o 22:01 Zygmunt Zawadzki notifications@github.com napisał(a):
Ok then. I'll schedule the release to Monday. @MarcinKosinski https://github.com/MarcinKosinski any uncaught errors will be your fault xD
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mi2-warsaw/FSelectorRcpp/issues/78#issuecomment-665883568, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTVVTUCB5K4QNJ6QAPBRNLR6B5YPANCNFSM4H2WISUA .
Hey, just created this PR https://github.com/mi2-warsaw/FSelectorRcpp/pull/86/files to check which files were changed. I extended NEWS and added parameters verification. I'm good to merge to have the relief algorithm in the pkg copied from FSelector however I imagine the request here is to get the algorithm in C++ which is beyond my skills
@MarcinKosinski I'll be porting this into cpp gradually;)
Thanks guys!! 🎉 Finally being able to avoid {FSelector}.
@zzawadz Do you have a rough ETA when this goes to CRAN?
Maybe @MarcinKosinski knows?
Looks like Zygmunt is doing some updates right now :)
On Tue, 6 Oct 2020, 17:28 Patrick Schratz, notifications@github.com wrote:
Maybe @MarcinKosinski https://github.com/MarcinKosinski knows?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mi2-warsaw/FSelectorRcpp/issues/78#issuecomment-704351397, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTVVTWZNMCNJV3M3VTS6FTSJMZRTANCNFSM4H2WISUA .
@pat-s it's on CRAN now.
Kudos Zygmunt :)!
On Sat, 10 Oct 2020, 20:20 Zygmunt Zawadzki, notifications@github.com wrote:
@pat-s https://github.com/pat-s it's on CRAN now.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mi2-warsaw/FSelectorRcpp/issues/78#issuecomment-706590889, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTVVTRJSE5MZ5UIVA42L2LSKCQXVANCNFSM4H2WISUA .
It would be great if the Relief filter from FSelector is available as a C++ implementation. We would like to use it in mlr and mlr3.
cross-ref https://github.com/mlr-org/mlr3featsel/issues/38