sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.31k stars 449 forks source link

Add new continuous probability distributions #11572

Open 6619e88d-3b26-4e39-8655-dcd1c168db6d opened 13 years ago

6619e88d-3b26-4e39-8655-dcd1c168db6d commented 13 years ago

There are many continuous probability distributions available on GSL that have not yet been wrapped into Sage. This ticket is for doing this.

Component: statistics

Author: Mato Nagel

Issue created by migration from https://trac.sagemath.org/ticket/11572

6619e88d-3b26-4e39-8655-dcd1c168db6d commented 13 years ago

Attachment: trac_11572_added_distributions.patch.gz

Patch adds a list of probability distributions

6619e88d-3b26-4e39-8655-dcd1c168db6d commented 13 years ago

Description changed:

--- 
+++ 
@@ -1 +1 @@
-There is a list of probability distributions available on GSL that can be wrapped into sage
+There is a list of probability distributions available on GSL that have been wrapped into sage
kcrisman commented 13 years ago

Author: Mato Nagel

kcrisman commented 13 years ago
comment:2

One minor, but important, comment - you seem to have subsumed the patch at #9080 in this patch. You should instead apply that patch to your Sage, and then create your new patch on top of that one. That not only makes it easy on the release manager, but also allows the author of the patch at #9080 his due credit :)

A very minor comment - you should set up your HG defaults so that instead of # User mato.nagel@gmx.net it gives something like Mato Nagel <mato.nagel@gmx.net>, or something similar. There should be something about this in the developer guide.

Also, a point of information related to your email to sage-devel; although you add tests for the new distributions, that wouldn't remove the need for #11514. It would be a good step in that direction, though!


More substantively, I think (my opinion only) that based on the discussion about this on sage-devel, the binomial and other "discrete" distributions should somehow be separated, if only for user convenience. It sounded from your email like this wasn't the final patch anyway, so it's not a big deal yet, but just recording this here for completeness.

Thanks for taking a stab at this! It's certainly sorely needed - as you say, GSL has lots of things for us. (So does Scipy, for that matter.)

6619e88d-3b26-4e39-8655-dcd1c168db6d commented 13 years ago
comment:3

you seem to have subsumed the patch at #9080 in this patch. You should instead apply that patch to your Sage, and then create your new patch on top of that one.

Well, that's exactly what I did. Only I transferred the changes of that patch into my patch by my merge software line by line. It is not the way it is meant to be done, is it? I'm quite unfamiliar with mercurial version control system. So to say I'm unable to rewind that patch. Of course I didn't meant to spoil the credits of patch #9080 developer. As you can see I didn't change his name in the header.

A very minor comment - you should set up your HG defaults

Maybe I can do that :)

Also, a point of information related to your email to sage-devel; although you add tests for the new distributions, that wouldn't remove the need for #11514. It would be a good step in that direction, though!

To be honest I don't know exactly what is meant by that. Please give some example. but you'd better do that at #11514.

More substantively, I think (my opinion only) that based on the discussion about this on sage-devel, the binomial and other "discrete" distributions should somehow be separated, if only for user convenience.

This needs rethinking. Two points to mention:

  1. From a programmers perspective, it is too much redundancy. The implementation of discrete and continuous distribution is almost the same except plotting. Well we can create a base class that implements all functionality except plotting and help. Still I believe its superfluous work.

  2. From a user perspective, especially a newcommer's, it is not likely that he/she knows, for instance, that binomial distribution is a discrete one and that an other class has to be used. So we may create frustration at this end too. Besides R doesn't make the distinction either

6619e88d-3b26-4e39-8655-dcd1c168db6d commented 13 years ago
comment:4

I added two notebook worksheets to test these new probability distributions.

!http://www.sagenb.org/home/pub/2887 !http://www.sagenb.org/home/pub/2886/

kcrisman commented 13 years ago

Description changed:

--- 
+++ 
@@ -1 +1 @@
-There is a list of probability distributions available on GSL that have been wrapped into sage
+There are many continuous probability distributions available on GSL that have not yet been wrapped into Sage.  This ticket is for doing this.
kcrisman commented 13 years ago
comment:5

Replying to @sagetrac-Kamhamea:

you seem to have subsumed the patch at #9080 in this patch. You should instead apply that patch to your Sage, and then create your new patch on top of that one.

Well, that's exactly what I did. Only I transferred the changes of that patch into my patch by my merge software line by line. It is not the way it is meant to be done, is it?

The way I do it is to do

sage: hg_sage.import_patch("The original patch name.patch")

Then do my own changes, then do

sage: hg_sage.diff() # gives me all the changes I've made to look at
sage: hg_sage.ci() # I "commit" the patch
sage: hg_sage.export(tip,"name I want to give the new patch")

This allows the original patch to stay the same. I strongly suggest reading the Sage development guide; it is pretty comprehensive, because we have so many new contributors.

I'm quite unfamiliar with mercurial version control system. So to say I'm unable to rewind that patch. Of course I didn't meant to spoil the credits of patch #9080 developer. As you can see I didn't change his name in the header.

No, of course you didn't. It takes a while to get used to this, for sure. You can do

sage: hg_sage.rollback()
sage: hg_sage.revert(options="--all")

to undo your patch. Then you can reimport the other patch correctly, then start making your changes again.

To be honest I don't know exactly what is meant by that. Please give some example. but you'd better do that at #11514.

Yes, you are right.

More substantively, I think (my opinion only) that based on the discussion about this on sage-devel, the binomial and other "discrete" distributions should somehow be separated, if only for user convenience.

This needs rethinking. Two points to mention:

  1. From a programmers perspective, it is too much redundancy. The implementation of discrete and continuous distribution is almost the same except plotting. Well we can create a base class that implements all functionality except plotting and help. Still I believe its superfluous work.

This is exactly what is done for most Sage modules. We create a base class that does everything that is the same, and then rewrite or overload methods that are different. That would be fine. What I think would be not so good is to try to do the same thing for discrete distributions as for continuous ones, because there should be some differences. Unless you can find a way to preserve backward compatibility while doing R's pattern.

But that could be a different ticket. Better to get this new and useful functionality in, rather than have a long discussion and then a year from now people still can't use all these GSL distributions easily wrapped!

  1. From a user perspective, especially a newcommer's, it is not likely that he/she knows, for instance, that binomial distribution is a discrete one and that an other class has to be used. So we may create frustration at this end too. Besides R doesn't make the distinction either

If someone doesn't know this, they perhaps shouldn't be using probability distributions! More seriously, the best way to deal with this is good documentation at the top of the file, so that it is easy to find in the reference manual.


How about this. Can we make this ticket be about adding new continuous distributions (so not binomial), and then continue the discussion about the rest on a different ticket? I feel like the consensus on sage-devel was in having this mathematically-inspired distinction, but it shouldn't hold up getting in the new distributions. At that location we can also discuss what these things should be named.

6619e88d-3b26-4e39-8655-dcd1c168db6d commented 13 years ago

Attachment: trac_11572_new_patch_on_top_of_fdist.patch.gz