r-gregmisc / gtools

Functions to assist in R programming

Moved baseOf from gplots to gtools #2

Closed smoe closed 3 years ago

smoe commented 4 years ago

Addressing https://github.com/r-gregmisc/gplots/issues/4

smoe commented 4 years ago

For the description I preferred my own wording since I sense it is a bit simpler. But for a manual page, your wording matches what the function is technically doing, and that seems more appropriate indeed. I have no strong feelings about it, so please steam ahead. I'll craft a manual page tomorrow.

smoe commented 4 years ago

It is done. I had some fun with the examples, admittedly. And whilst thinking about it, I would like to also prepare an inverse function to baseOf - as it is, this feels a bit incomplete. Also, if we think of feature representation, the base does not need to be equal at all positions, so at some point b should be allowed to be a vector. The machine learning community should appreciate that. If I get positive vibes from you about this idea then I would like to create an issue to remind me about it (and to possibly attract additional feedback).
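To make the "inverse function" idea concrete, here is a minimal sketch of a digit expansion and its inverse. It is written in Python purely for illustration and is not the gtools code: the names `to_digits`, `from_digits` and `min_len` are hypothetical, and the behaviour (most significant digit first, optional zero padding) is an assumption about what such a pair could look like.

```python
def to_digits(value, base=10, min_len=1):
    """Expand a non-negative integer into its digits in `base`,
    most significant digit first, left-padded with zeros to `min_len`."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if base < 2:
        raise ValueError("base must be at least 2")
    digits = []
    while value > 0:
        value, remainder = divmod(value, base)
        digits.append(remainder)  # least significant digit first
    # Pad with zeros (no-op when enough digits are already present).
    digits.extend([0] * (min_len - len(digits)))
    return digits[::-1]


def from_digits(digits, base=10):
    """Inverse of to_digits: fold a digit sequence back into an integer."""
    value = 0
    for digit in digits:
        if not 0 <= digit < base:
            raise ValueError(f"digit {digit} is not valid in base {base}")
        value = value * base + digit
    return value


if __name__ == "__main__":
    print(to_digits(10, base=2))            # [1, 0, 1, 0]
    print(from_digits([1, 0, 1, 0], base=2))  # 10
```

With such a pair, `from_digits(to_digits(n, b), b)` gives back `n` for any non-negative `n`, which is the round-trip property the inverse function would be expected to have.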

warnes commented 4 years ago

Hi Steffen,

Thanks for the man page. I’ll try to take a look today.

I can see that having the reverse functionality would be helpful, so I’m fine with adding a function (or an argument for the current function perhaps) for that.

As for handling a vector for the base, are you thinking of allowing conversion to/from the “one-hot” (aka factor) encoding, or something else?

-Greg


-- "Whereas true religion and good morals are the only solid foundations of public liberty and happiness . . . it is hereby earnestly recommended to the several States to take the most effectual measures for the encouragement thereof." Continental Congress, 1778

smoe commented 4 years ago

The "b may be a vector" variant makes sense only for machines, imho, so it would need the reverse function to baseOf to get some input. I don't fully understand your question (at least not without googling :o) ), admittedly, but my gut feeling says the answer is "yes". To give a stupid example, you may want to come up with a single number to describe the distribution of burns on patients. You have values from 0 to 4 for arms, 0 to 2 for hands, 0 to 3 for heads, ... but only one field to store that value. And if you imagine some better example with scores that are not representable by single digits (maybe square inches?), then you see how each body part may be represented by a number in feature space, and how those numbers can be hashed into a single value without information loss. For this stupid example the simple sum of those features may be preferable as a representation in most cases, but maybe not if you want to test at some point for an association with other clinical phenotypes like depression, for which the face is likely more important than an arm or leg even though the same number of inch^2 was burnt.

But that is just what I see for the future once our paper on the polyominoes is submitted.
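The burn-score example above amounts to a mixed-radix encoding, where each position has its own base. A minimal sketch in Python, for illustration only: the function names `encode_mixed` and `decode_mixed` are hypothetical, and the radices 5, 3 and 4 are simply the value counts for the ranges 0 to 4, 0 to 2 and 0 to 3 mentioned in the comment.

```python
def encode_mixed(digits, radices):
    """Pack one digit per position into a single integer.
    Position i may take the values 0 .. radices[i] - 1."""
    if len(digits) != len(radices):
        raise ValueError("digits and radices must have the same length")
    value = 0
    for digit, radix in zip(digits, radices):
        if not 0 <= digit < radix:
            raise ValueError(f"digit {digit} out of range for radix {radix}")
        value = value * radix + digit
    return value


def decode_mixed(value, radices):
    """Inverse of encode_mixed: recover the per-position digits."""
    digits = []
    for radix in reversed(radices):
        value, digit = divmod(value, radix)
        digits.append(digit)  # last position comes out first
    if value != 0:
        raise ValueError("value is too large for the given radices")
    return digits[::-1]


if __name__ == "__main__":
    radices = [5, 3, 4]            # arms 0..4, hands 0..2, head 0..3
    scores = [2, 0, 1]
    code = encode_mixed(scores, radices)
    print(code)                    # 25
    print(decode_mixed(code, radices))  # [2, 0, 1]
```

Unlike a plain sum of the scores, the packed value can be decoded back to the individual scores, so nothing is lost. With all radices equal this reduces to the ordinary fixed-base case.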

warnes commented 4 years ago

It certainly makes sense to delay working on this feature until there is a realistic use case.


smoe commented 4 years ago

Hi Greg, is there anything you still want me to do on the man page? I am asking since I thought of implementing a Gray code variant that would need baseOf to be in place. Hope you are fine! Steffen

warnes commented 3 years ago

Changes separately integrated into master.