Improved docstrings with units

fib-international / structuralcodes

A Python library for structural engineering calculations

https://fib-international.github.io/structuralcodes/

Apache License 2.0

81 stars 22 forks source link

Improved docstrings with units #139

Open DanielGMorenaFhecor opened 2 months ago

DanielGMorenaFhecor commented 2 months ago

Some descriptions and results lack information about units. For instance, the GrossProperties dataclass returned/used in the GenericSection class does not include unit documentation.

talledodiego commented 2 months ago

You are absolutely right. I did not included any units because the units should be a user responsibility. If you input the section in mm and material properties in MPa the units of area will be mm2, inertia will be in mm4 etc. But if you use m for instance the area will be in m2.

It is true that we should decide how to approach the problem of units in the whole package in the sense that the code equations sometimes assume some units.

There are basically three options I think:

the package is developed with constrained units decided by us. In this case we could improve all docstrings saying that all length should be in a certain unit (cm or mm for sintance) and same for all other quantities.
the package is basically unitless and it is a user responsibility the input of consistent units for the computations being meaningful (e.g. OpenSees works in this way)
we include some unit-management system and deal with units in all the package.

I tend to not like option 1 because it is un-necessarily over-constraining. For instance in the whole section strength computation you don't need specific units and the user can use kN, m, kPa or N, mm, MPa obtaining the correct moment strength response in kNm or Nmm in the first or second case.

Approach 2 is in principle the easiest one from development side because we move responsibility entirely to user, even if users may struggle with being consistent with units. Furthermore sometimes happens that some code equations are unit-dependent and therefore we should add something to deal with those cases.

Approach 3 is probably too much since in that case we should manage each single input number with its own unit dealing not anymore with plain float numbers but with more complex objects?

What is your opinion on this?

DanielGMorenaFhecor commented 2 months ago

Thank you for sharing your thoughts. Let me expose my opinion on these 3 approaches:

Approach 1: While it may be seen as over-constraining, it offers clarity and consistency, which is beneficial. The straightforward nature of this approach could reduce potential errors and make the package easier to use.
Approach 2: Although it provides flexibility, it might lead to confusion about the units of the returned values, especially in cases where specific units are assumed in code equations. This could result in more support requests and user errors.
Approach 3: As you mentioned, while tools like Pint could help manage units, introducing this complexity may not be necessary and could complicate the development and use of the package.

Given these considerations, I lean towards Approach 1. It’s direct and clear, ensuring users have no doubts about which units to use throughout the package. Additionally, including utility methods for unit conversion (i.e. to_MPa(value: float, unit: Literal['Pa', 'ksi',...]) -> float), could further enhance usability without adding significant complexity.

@MestreCarlos what do you think about this?

MestreCarlos commented 2 months ago

I also prefer approach 1, even though it is more rigid for the user. I think this for two reasons. First, it is easier for the user not to have to figure out how to enter data and interpret the results. Also, I think that in the future the section module will be increasingly related to package level 0 which contains the formulation of the codes with defined units. I believe that the user will more easily find errors due to misinterpretations of units if we do not fix all inputs and outputs.

talledodiego commented 2 months ago

We discussed some more about this in #105. @mortenengen perhaps we can merge the discussions somehow?

I like more those approaches respect approach 1 that I find too rigid.

Let's suppose a user want to use the section verification as a post processing for OpenSees (or even as a hinge-model to input a hinge in OpenSees). In OpenSees he used N, mm, s (so stresses in MPa). Here we force to use daN, cm, s (so stresses in daN/cm2 like old-school engineering often adopted). The user must convert stuff, making this ugly. The other way around if we choose N, m, s and in OpenSees he is using kN, m, s... We shouldn't forget that there are users that work in imperial units also. If we seek some day to try to implement also ACI we should take this in mind.

Maybe permitting the user to set a "set" of base units (like N, mm, s or kN, m, s or ksi, in, s or whatever) is a possibility, assuming then that all measures are given in the correct unit for what is decided.

Anyhow we could introduce a submodule of utilities offering functions for doing some conversions as suggested by @DanielGMorenaFhecor. In this case for instance the user can select as global base units (we can also propose a default one) of daN, cm, s but then maybe he can use convert_stress(12,'MPa') that convert the stress from the literal MPa to the current global unit (in this case would be daN/cm2).

DanielGMorenaFhecor commented 2 months ago

I understand your perspective on offering flexibility to the end-user, as it makes a lot of sense. I apologize if I came across as too forceful in this or other requests. My intention is simply to find the best solution for the end user and to highlight some issues we encountered during the development of the examples.

What we've been discussing in the office is how intuitive it is to work with consistent units. For some formulas or expressions, this might be straightforward, but I don't think this applies universally. Additionally, some users might not have a clear sense of the 'order of magnitude' for certain expressions, which could lead to doubtful results and potentially undermine trust in the outcomes. As you may have guessed, we haven't used OpenSees before, so we're not entirely familiar with how intuitive this approach would be within that framework. These are just some thoughts we're considering.

The idea of using a global set of units that can be displayed or modified by the end-user seems like a good, customizable solution. However, I'm also curious about how this would be implemented. Some possibilities could include:

Implementing a singleton UnitManager class.
Relying on global variables.
As you suggested, using a third-party package.

I feel that using singletons and globals could be tricky, as there would be many references to these variables throughout the codebase. Tracking changes in this global unit configuration could also be challenging, especially if we've already computed or cached values in our calculations.

Another question is to what extent we should implement this. Should it be applied only to the superclasses like Material or Section, or should it extend to all the code formulas (which could be a daunting task)?

This is an important topic that may require a more in-depth discussion, though it might not be essential for the first release.

Thanks for your response!

talledodiego commented 1 month ago

Thanks for the detailed explanation and sorry for the delay of my answer.

I agree with you that this topic should be discussed more in-depth.

mortenengen commented 4 weeks ago

Thanks for bringing up this discussion @MestreCarlos, @DanielGMorenaFhecor, and @talledodiego. I am sorry to be late to the table.

In #105 we identified three options, one option referred to as point 1 in #105 where we put the responsibility on the user to input units that fit with the units described in the docstrings of our functions and methods, and two options referred to as points 2 and 3 where we implement some global system for handling units.

I agree with the opinions above, where you state that a global state object is not preferable. And trying to implement an active measure for handling units might also lead to complicated implementation of code equations, and lack of transparency.

I would lean towards a slightly more liberal (I think?) version of approach 1 above: 1) We implement code equations with units as given in the design code, and explicitly describe the units of arguments and returns in the docstrings. In functions where the equation is actually unitless, e.g. where an area is divided by an area, or a stress is divided by a stress, we should not write units in the docstrings. 2) On higher levels, e.g. in sections, we use consistent units, and let it be up to the user to handle.

This requires that we write clear docstrings, and for sure we have room for improvements for the moment.

With this approach, we implement functions that are one-to-one with the design codes, which is a benefit for the user. And we maintain a flexibility in the section classes allowing for example the user to create the geometry using m and add a constitutive law with stresses in kPa and getting the response in kN and kNm. And we leave the responsibility to the user to convert units where needed, e.g. where a code equation requires MPa for input, and returns MPa, but the user needs kPa in further calculations.

What are your thoughts on this?