Axect / puruspe

PURe RUSt SPEcial library
Apache License 2.0
15 stars 6 forks source link

Develop puruspe v0.3.0: Integrate lambert_w and Implement Comprehensive Tests #12

Closed Axect closed 4 weeks ago

Axect commented 3 months ago

puruspe v0.3.0 Development

This issue tracks the development of puruspe v0.3.0, which will focus on integrating the lambert_w crate, implementing comprehensive tests, and restructuring the project for better organization.

Objectives:

  1. Integrate lambert_w crate:

    • Add lambert_w as a dependency in Cargo.toml
    • Import the four key functions from lambert_w into puruspe
  2. Implement comprehensive tests:

    • Develop a testing strategy for all functions in puruspe
    • Write unit tests for existing and newly imported functions
    • Implement integration tests where appropriate
  3. Restructure project:

    • Split functions from lib.rs into separate files based on function types
    • Improve code organization and navigation
  4. Update documentation:

    • Add usage examples
    • Update README and other relevant documentation to reflect new structure
  5. Ensure compatibility:

    • Verify that changes don't break existing puruspe functionality

Tasks:

Discussion:

Please use this issue to discuss any challenges, suggestions, or questions related to the v0.3.0 development. We welcome all contributors to participate in this discussion.

Recent suggestions from @ethanbarry and @JSorngard have highlighted the need for better code organization. We'll be implementing these changes as part of this version update.

/cc @JSorngard @ethanbarry

JSorngard commented 3 months ago

Nice! I will start on the integration parts today!

JSorngard commented 3 months ago

For testing I suggest adding

#[cfg(test)]
mod tests;

to lib.rs, and then in the file src/tests.rs have all the test functions:

use super::*;

#[test]
fn check_gamma() {
    assert_eq!(gamma(6.0), 120.0);
    ...
}
...

In lambert_w I use the approx crate as a dev-dependency to be able to test calculations that only produce approximate output.

Axect commented 3 months ago

Thank you for the great suggestion on testing structure. I completely agree with your approach. Adding the approx crate as a dev-dependency is also a good idea for handling approximate outputs in our tests.

Regarding how to best test special functions, I've been considering using data tables generated from standard tools like scipy or cmath. We could create tables with values calculated at various random points and then compare our implementation against these values. This approach would allow us to cover a wide range of inputs and ensure our calculations are accurate across different ranges.

Do you have any thoughts on this testing strategy for special functions? Or perhaps you have experience with other effective methods? I'd be very interested to hear your ideas or any best practices you've encountered in this area.

Also, if you have any specific points in the domain of these functions that are known to be particularly challenging or important to test accurately, please let me know. We could ensure our test suite covers these critical areas.

JSorngard commented 3 months ago

Regarding how to best test special functions, I've been considering using data tables generated from standard tools like scipy or cmath. We could create tables with values calculated at various random points and then compare our implementation against these values. This approach would allow us to cover a wide range of inputs and ensure our calculations are accurate across different ranges.

I think this sounds good. Make a table of the input and the correct output, loop over all values and test. I used wolframalpha.com for determining the correct values of the Lambert W function, but I see that it exists in scipy, which should make it easier to quickly get values.

Also, if you have any specific points in the domain of these functions that are known to be particularly challenging or important to test accurately, please let me know. We could ensure our test suite covers these critical areas.

I am not so knowledgeable here I think. For the Lambert functions I found that even though the output was correct to many bits, the error in the least significant bits result in a value that is very close to the correct value when it's small, but the difference grows when the correct value is large. So the tests I have written are less stringent at higher output values.

ethanbarry commented 3 months ago

Hey! I'd love to help with this as well. I'm concerned about how long lib.rs is going to become, though. We could split it up into submodules, but do pub use to export the same structure.

This way, each type of function lives in its own file, and their specific tests will be right below them...

JSorngard commented 3 months ago

That sounds good. Easier to navigate.

How about functions of the same type share a file, e.g. all gamma functions in one, all Bessel functions in one etc? Then there's not too many files either.

ethanbarry commented 3 months ago

That sounds perfect! Should we try to categorize them & split lib.rs before making too many other changes?

JSorngard commented 3 months ago

Probably, yeah. Should probably also wait on @Axect to give their opinion on it first as well I think.

ethanbarry commented 3 months ago

Yeah, I think we have an even spread of time zones here!

JSorngard commented 3 months ago

Work around the clock!

Axect commented 3 months ago

@ethanbarry @JSorngard

Thank you both for your patience. Although it's not quite bedtime in my time zone, I was delayed due to a group dinner.

Initially, Puruspe was designed with simplicity in mind, using a single lib.rs file as we had a limited number of functions. However, thanks to your excellent contributions, we've seen a significant increase in both the variety of functions and the overall code length. This growth has indeed highlighted the need for a more structured approach.

I completely agree with both of your suggestions:

  1. Splitting the code into separate files
  2. Organizing functions by type (e.g., grouping all gamma function-related functions)

These changes would undoubtedly improve navigation and facilitate future implementations.

I believe it would be beneficial to incorporate this restructuring into the current issue. I'll update the issue to reflect these plans for file separation and reorganization.

JSorngard commented 3 months ago

Oh also, we are not changing the API in any way right? So I think it could be puruspe v0.2.6.

ethanbarry commented 3 months ago

I moved the test (previously in src/bin) for Dawson's integral to the new dawson file y'all created. Not sure if that's how @Axect would like the tests arranged—I'll edit it if there's a preferred way.

17

Axect commented 3 months ago

Oh also, we are not changing the API in any way right? So I think it could be puruspe v0.2.6.

Thank you for your input. You make a valid point that since the user-facing API hasn't changed, version 0.2.6 could be appropriate.

However, I believe version 0.3.0 is more suitable for the following reason: Up until now, Puruspe has proudly maintained a "no dependency" stance. While we're only adding a pure Rust dependency, it still marks a significant shift from our previous approach. This change, even if not visible in the API, represents a fundamental difference in the library's architecture and philosophy.

Given this notable shift, I think it's appropriate to reflect this change with a minor version bump to 0.3.0. This version number would better communicate to our users that while the API remains stable, there's been a meaningful change under the hood.

What are your thoughts on this reasoning? I'm open to further discussion on this matter.

JSorngard commented 3 months ago

Aah, yes I did not think of that. That is a very good reason.

Axect commented 2 months ago

@JSorngard @ethanbarry

Hello team! Thanks to our collaborative efforts, we've made significant progress on puruspe v0.3.0. I'd like to take a moment to review what we've accomplished together and see if there's anything else we should consider before finalizing this version.

  1. README Update: We've updated the README to reflect the new changes and project structure. Could we do a final review to ensure it effectively communicates the library's capabilities and usage?

  2. New Features: Are there any additional features or improvements you think we should include in this version?

  3. Testing: Thanks to your contributions, we now have comprehensive tests for both new and existing functions. Is there any area you feel might benefit from additional coverage or refinement?

  4. Overall Structure: We've restructured the project with separate files for different function types. How do you feel about this new organization? Does it improve code navigation and maintainability from your perspective?

  5. Documentation: We've updated inline documentation. Is there any part of the documentation that you think needs more clarity or detail?

Your insights have been invaluable throughout this process, and I'm grateful for your contributions. Please share any additional thoughts or suggestions you may have. Your continued input is crucial in ensuring that this release meets our collective quality standards and user expectations.

Thank you both for your hard work and collaboration on this project!

ethanbarry commented 2 months ago

It's been a pleasure! The semester has started back up, so I will be less active, but I still hope to contribute. I'll stay tuned and watch the issues.

JSorngard commented 2 months ago

Yeah, I've had fun! I will take a look through everything in the repo in an hour or so and add a comment here with anything I find.

JSorngard commented 2 months ago

I found two things I think we may want to improve before publication:

  1. I may just be bad at looking, but I am unable to find any tests for besselik, besseljy, cached_Inu_Knu, cached_Jnu_Ynu, cached_besselik, cached_besseljy, Inu_Knu, and Jnu_Ynu.
  2. There are some functions that can panic, but this information is not currently present in their documentation, e.g. besseljy. We could add a "# Panics" section to their documentation and describe when they can panic there.

And a third thing that may be worth thinking about:
Currently a user of the cached_* functions can pass in any HashMap with incorrect values they have added themselves and thus get the wrong results. It could be useful to make a newtype like for example struct BesselikCache(HashMap<(u64, u64), (f64, f64, f64, f64)>) and make the function take in that, and then not expose the ability to insert values into the map to the user.

Beyond that, everything looks great!

Axect commented 2 months ago

@JSorngard

Thank you for your thorough review and valuable feedback! Your observations are spot-on, and I appreciate you bringing these issues to our attention. Let's address each point:

  1. Missing Tests: You're absolutely right. I've confirmed that we're missing tests for besselik, besseljy, cached_Inu_Knu, cached_Jnu_Ynu, cached_besselik, cached_besseljy, Inu_Knu, and Jnu_Ynu. This is an oversight on our part, and I'll make sure to add comprehensive tests for these functions as soon as possible.

  2. Panic Documentation: Your suggestion to add a "# Panics" section to the documentation of functions that can panic is excellent. This is indeed important information for users, and we should definitely include it. I'll go through the codebase and add this section where appropriate, starting with besseljy and any other functions that have potential panic conditions.

  3. Cached Functions Safety: Your point about the potential misuse of cached_* functions is well-taken. I agree that creating a newtype (like BesselikCache) and not exposing the ability to insert values directly is a much safer approach. It's a great idea to encapsulate this functionality and prevent potential errors from incorrect manual insertions. I propose we implement this change by:

    • Making the current cached_* implementations private.
    • Creating high-level structs (like BesselikCache) for each cached function.
    • Implementing methods on these structs to handle the caching logic safely.

These are all valuable improvements that will enhance the robustness and user-friendliness of our library. I'll start working on these changes right away.

Is there anything else you think we should consider before moving forward with these improvements? Once we implement these changes, we'll be in a much better position to publish version 0.3.0.

Thank you again for your insightful feedback.

JSorngard commented 2 months ago

I can work on the cached_* functions :D

The only other thing I have noticed is that some functions (e.g. Jn) take a usize as argument, but they do not use it for indexing. This means that the range of possible values they can take as input is different between different architectures. It might be worth it to consider changing the input type to an unsigned integer with a fixed size, like u32 or u64 depending on the maximum argument value that the implementation can handle.

Axect commented 2 months ago

Thanks for your ongoing input and for offering to work on the cached_* functions!

Regarding the usize usage:

  1. You're absolutely right. For consistency with our f64 targeting, we'll change these to u64.
  2. We'll update all relevant functions (Jn, Yn, In, Kn, etc.), their documentation, and tests.

This change will ensure consistent behavior across different architectures.

I'll start on these usize to u64 changes while you focus on the cached_* functions.

Again, thanks for your valuable insights!

JSorngard commented 2 months ago

Nice! It may (I am unsure about the correct tradeoffs for this crate here) be better to use u32 since it can be converted into a f64 losslessly for all values (there's a From<u32> for f64 impl), but converting a u64 into a f64 will result in approximation for some values. This could be fine though as the first integer for which that happens will be of the order 9e15, which we may not expect a user to actually input.

Using u32 would unfortunately stop a user from inputting values that are larger than 2^32-1.

Axect commented 2 months ago

I agree. Now, I changed types of order variable for all integer order Bessel functions to u32.

Axect commented 1 month ago

Hi @JSorngard and @ethanbarry,

We've made significant progress on the objectives for puruspe v0.3.0, and we're now approaching the publication stage. Before we proceed with the release, I wanted to check in with both of you:

  1. Do you have any final thoughts, suggestions, or concerns regarding the changes we've implemented?
  2. Is there anything you feel we should address or review before publishing v0.3.0?
  3. Are you satisfied with the current state of the documentation and test coverage?

Your input has been invaluable throughout this process, and I want to ensure we haven't overlooked anything important. Please share any final feedback you might have.

If everything looks good from your perspective, we'll move forward with publishing v0.3.0. Thank you both for your contributions to this release!

JSorngard commented 1 month ago

I think everything looks good!

This was very fun to contribute to :D

JSorngard commented 1 month ago

Oh one thing: in the README there's a section called "Project Structure" that I think we may want to rework or remove. A user of the crate might think that those are modules that are exposed in the API and they should import from them when they are not, and someone who intends to contribute to the crate will see the file structure immediately when they look at the repo on github anyway. The test location of the Lambert W, gammap, and dawson functions could also be added to the "Precision" section.

Axect commented 1 month ago

Thank you for your valuable feedback. I agree with both of your suggestions.

I've removed the "Project Structure" section from the README as you recommended. I've also added information about the test locations for the Lambert W, gammap, and dawson functions to the "Precision" section.

If there are no further comments or suggestions, I plan to publish the release tomorrow (October 17th) at approximately 1:00 PM KST (4:00 AM UTC).

Let me know if you have any other thoughts or if there's anything else you'd like me to address before the release.

Axect commented 4 weeks ago

Puruspe 0.3.0 has been successfully published!

Thank you to everyone who contributed feedback and suggestions.

You can find the new release on crates.io: https://crates.io/crates/puruspe

If you encounter any issues or have any further suggestions for improvement, please don't hesitate to open a new issue. Your feedback is always appreciated as we continue to enhance Puruspe.

This issue can now be closed. Thank you all for your contributions!