spinjet / pdopt-code

Public Release Version of PDOPT
https://pdopt-code.readthedocs.io/en/main/
MIT License

Review suggestions, part II #4

Closed e-dub closed 8 months ago

e-dub commented 9 months ago

So far impressive stuff! I am starting a new issue for my comments as part of the JOSS review. Work in progress...

Code

...

Paper

Summary

Statement of need

State of the field

References

spinjet commented 9 months ago

Dear @e-dub, thank you for your comments and suggestions. I will address these, as @jbussemaker also gave similar feedback on the paper.

Summary

  • [ ] Contemporary... I do not know if only contemporary systems are characterized in this way. What is different now is that we are trying to better control/master/handle/design with this complexity, no?

I agree with your interpretation; perhaps in the search for brevity this meaning is not well communicated. My intended idea was that, in the search for better optimality, more and more detail is present in the earlier design phases, hence we have to deal with complex systems more generally, be it an airframe or a consumer automobile. It's true that complex systems have existed historically, but they were limited to special applications (say nuclear power plants, space systems and so on). So dealing with complexity early on, especially in the aeronautical industry, is a phenomenon of the last 20-30 years. For instance, until the 1960s an aircraft design could be carried up to the preliminary stage by a single chief designer. Nowadays this is almost impossible. I hope this conveys the story I had in mind; I can expand on it in the paper if needed.

  • [ ] You are framing the context within the realm of uncertainty in the summary. Do I understand correctly that the surrogate model is able to capture the design space and its uncertainty? Or does the surrogate model work in a probabilistic way? Can you please add a sentence to make this clearer?

The surrogate model captures uncertainty in the sense that it's a Gaussian Process, and it's being leveraged to calculate the probability that a design point falls within the decision boundary of a requirement (i.e. the domain where the requirement is satisfied in the design space). This probability should be interpreted in a Bayesian way: it measures the level of confidence that the requirement is satisfied. The advantage of this approach is twofold:

  • [ ] The design of such systems entails high uncertainty due to the large number of parameters defining it. Is the main source of uncertainty due to the number of parameters or the assumptions and the abstractions in modelling?

Within PDOPT the source of uncertainty is epistemic: that is, the designer does not know precisely which combinations of design parameters satisfy the requirements set by the problem. Indeed, as explained above, the code leverages the probabilistic exploration to restrict the range of design sets to search by eliminating those where there's a high confidence that the requirements are not satisfied. So I reckon the sentence doesn't properly convey the intended concept; I'll correct it. What I had in mind was that the number of parameters leaves many degrees of freedom in the design problem, hence the epistemic uncertainty in finding the right set of parameters.
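As a rough illustration of the probabilistic elimination described above, the sketch below computes the probability that a requirement is satisfied from a Gaussian predictive distribution and then decides whether a design set is kept. It assumes a requirement of the form g(x) <= limit and a surrogate that returns a predictive mean and standard deviation at each sample point. The function names, the averaging over sample points, and the 0.5 threshold are hypothetical choices for this example and are not taken from the PDOPT codebase.

```python
import math


def probability_satisfied(mean, std, limit):
    """Return P(g <= limit) for g ~ Normal(mean, std)."""
    if std <= 0.0:
        # Degenerate case: no predictive uncertainty at this point.
        return 1.0 if mean <= limit else 0.0
    z = (limit - mean) / std
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def keep_design_set(predictions, limit, min_probability=0.5):
    """Average the satisfaction probability over sample points of one set.

    `predictions` is a list of (mean, std) pairs, standing in for the
    surrogate output at points sampled inside the set (an assumption).
    """
    probabilities = [probability_satisfied(m, s, limit) for m, s in predictions]
    set_probability = sum(probabilities) / len(probabilities)
    return set_probability >= min_probability, set_probability


# Two illustrative sets with made-up surrogate predictions.
promising = [(80.0, 10.0), (90.0, 10.0)]
unlikely = [(130.0, 10.0), (140.0, 10.0)]

kept_a, p_a = keep_design_set(promising, limit=100.0)
kept_b, p_b = keep_design_set(unlikely, limit=100.0)
print(kept_a, round(p_a, 3))
print(kept_b, round(p_b, 5))
```

In this toy case the first set is kept and the second is discarded, which mirrors the idea of removing sets where there is high confidence that the requirement is not met.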

Statement of need

  • [ ] Exploration and search phases: maybe search phase better described as exploitation or optimization or dimensioning?

In my PhD thesis I preferred the term "search phase" because "optimisation" may be confused with the actual optimisation being carried out within each surviving design set. Since this step searches each set with a local optimisation problem to recover the actual design points, I used the word "search". "Exploitation" may be a suitable candidate, as it's more familiar in the computational engineering and mathematical optimisation fields. Admittedly, in the codebase the word "optimisation" is used for the module carrying out the search phase, so perhaps I should reflect on which nomenclature to use and make it uniform.

  • [ ] A full PDOPT test case I think this is not for just test cases.

Apologies, what I meant was a full run or PDOPT analysis. I'll correct it immediately.

  • [ ] Thanks to the Exploration phase, the computational cost for design space exploration can be reduced by up to 80% Can this be a bit reworked or detail added. "thanks to exploration, exploration is faster..." does not sound right. Maybe "advanced exploration" or something.
  • [ ] The speedup cited in this work is the filtering method proposed in Spinelli, Anderson, et al (2022)?

Yes, the computational cost saving is for the overall design space exploration. In other words, we avoid having to run an optimisation in areas of the design space that would not yield feasible designs. This was originally shown as a proof of concept in Spinelli, Anderson et al (2022), as you correctly point out. I see that the sentence structure of "Thanks to the Exploration [...] by up to 80%" is confusing and may lead the reader to confuse the overall design space exploration task with the "exploration step" of PDOPT. This is going to be reworked.
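To make the origin of such a saving concrete, here is a back-of-the-envelope sketch. It assumes that every design set costs the same to optimise and that the exploration step has a fixed overhead; both the cost model and the function name are hypothetical and only illustrate how discarding sets translates into fewer optimisation runs. The numbers are made up and are not results from the paper.

```python
def exploration_saving(n_sets_total, n_sets_surviving, cost_per_search,
                       cost_exploration=0.0):
    """Fraction of cost saved relative to optimising every set."""
    baseline = n_sets_total * cost_per_search
    filtered = n_sets_surviving * cost_per_search + cost_exploration
    return 1.0 - filtered / baseline


# 100 sets, 20 survive the exploration, no exploration overhead.
saving_ideal = exploration_saving(100, 20, cost_per_search=1.0)

# Same case, with an exploration overhead equal to 5 searches.
saving_with_overhead = exploration_saving(100, 20, cost_per_search=10.0,
                                          cost_exploration=50.0)
print(saving_ideal, saving_with_overhead)
```

Under these assumptions the saving depends only on the fraction of sets discarded and on how cheap the exploration is compared with a single search.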

State of the field

  • [ ] Review checklist requires a sort of "state of the field". Consider adding this explicitly as a new section or implicitly in existing sections to address the point below.
  • [ ] Add a comment on how this problem is typically solved and possibly with what software packages. Compare this software to typical methods. Advantages and disadvantages

@jbussemaker also suggested including this section. I originally thought that providing references to other publications would be enough for a reader interested in deepening their understanding of this topic, but I agree it's necessary. Perhaps this was missing from the JOSS requirements when I made the submission? I plan to include this section, along with a brief reference to the theory.

References

  • [ ] Consider extending sources beyond self-citations

I agree, this was also pointed out by @jbussemaker. I plan to provide some references to the theory that underlies the concepts PDOPT leverages and similar approaches for design space exploration.

From your questions and suggestions I recognise the paper doesn't provide enough context on the theory and ideas behind PDOPT. I am really thankful for your questions because they elucidated some gaps in the paper which I completely overlooked during the submission. I will work on these and include some figures which help visualise what the framework is doing.

spinjet commented 9 months ago

Added an almost complete section for State of the Field with relevant references.

spinjet commented 9 months ago

Dear @e-dub.

I've made the improvements to the paper that you suggested. I've added a State of the Field section, introduced more references, and made improvements in the sections you discussed.

e-dub commented 8 months ago

Great changes. The paper has really advanced. In rereading it, I have the following questions, critique, etc.

Paper, part II

Documentation

(please reopen this issue)

kyleniemeyer commented 8 months ago

@e-dub @spinjet regarding statement of need in the documentation: unlike in the paper, where we have the required section "Statement of Need", the docs don't have to explicitly have a header titled like that, but there does need to be a clear statement/description of need as described above. The README / docs landing page is often the best place for this.

spinjet commented 8 months ago

Thank you for your feedback @e-dub. I'll address the changes here:

  • "Figure Figure", "Figures Figure", "Fig. Figure"

This has been corrected.

  • From the text, it is not clear if you are dealing with discrete or continuous design variables or both. Is either phase limited by design variable type? Can you mention this.

The framework is designed to handle both types, it seems I forgot to explicitly mention this. It has been added in the first paragraph of the Statement of Need section.

  • Please mention if there are any limitations to responses included in optimization? Convexity, unimodality, continuity, smooth, bifurcated, etc.? Or number of variables or constraints? Could islands of feasibility in the design space be handled?

Feasibility islands are intrinsically handled, as the design space is divided into small portions, each one evaluated in the probabilistic exploration phase, and the ones with a sufficiently high probability of containing feasible designs are then analysed with the optimiser (search phase). Regarding the behavior of the evaluation function for optimisation, I've chosen to use UNSGA-3 because it's a gradient-free population based algorithm... which should handle even non-smooth functions (which, in the case of multi-disciplinary models, is a possibility). Finally, the code has not yet been tested with a large number of objectives and input parameters (so far a maximum of 8 inputs, 3 objectives and 3 constraints), but its limitations ultimately come from the computational cost of the model. The surrogate option for the search phase was added to enable handling a large number of parameters, as the GA would require a lot of function evaluations in that case. These considerations have now been integrated in the 8th paragraph of the Statement of Need section, where the search phase is described.
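The way a set-based split can pick up disjoint feasibility islands can be shown with a small toy example. This is only a sketch under stated assumptions: the feasibility function, the equal-width split of each parameter, the Monte Carlo estimate (used here in place of the Gaussian Process surrogate), and the 0.1 threshold are all hypothetical and do not reproduce PDOPT's implementation.

```python
import itertools
import random


def split_range(lower, upper, n_levels):
    """Split [lower, upper] into n_levels equal-width intervals."""
    step = (upper - lower) / n_levels
    return [(lower + i * step, lower + (i + 1) * step) for i in range(n_levels)]


def build_sets(bounds, n_levels):
    """Each design set is one combination of per-parameter intervals."""
    per_parameter = [split_range(lo, hi, n_levels) for lo, hi in bounds]
    return list(itertools.product(*per_parameter))


def is_feasible(x, y):
    """Toy requirement with two disjoint circular islands of feasibility."""
    in_island_a = (x - 0.125) ** 2 + (y - 0.125) ** 2 <= 0.1 ** 2
    in_island_b = (x - 0.875) ** 2 + (y - 0.875) ** 2 <= 0.1 ** 2
    return in_island_a or in_island_b


def estimate_p_feasible(design_set, n_samples, rng):
    """Estimate the fraction of a set that is feasible by random sampling."""
    hits = 0
    for _ in range(n_samples):
        point = [rng.uniform(lo, hi) for lo, hi in design_set]
        if is_feasible(*point):
            hits += 1
    return hits / n_samples


rng = random.Random(0)
sets = build_sets([(0.0, 1.0), (0.0, 1.0)], n_levels=4)
surviving = [s for s in sets if estimate_p_feasible(s, 200, rng) >= 0.1]

for design_set in surviving:
    print(design_set)
```

Only the two sets that contain an island survive, so a later search step would be run on each of them separately and neither island would be lost.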

  • Is there a better way to show "traditional design" versus your proposed set-based method in Figure 1? I do not know if this is accurate (or it is not clear to me)

I re-read the paragraph where the Figure is referenced and it was not sufficiently described there. The distinction between SBD and the classical design approach is in focusing on eliminating unfeasible designs and leaving options open rather than selecting a single "optimal" design point to refine iteratively. SBD is more robust because it allows more margin of movement in case adjustments on the designs are needed (the learning points in the timeline). I've expanded the paragraph to include these aspects, hopefully this makes the image more clear.

  • Great that you have included a plot of the architecture. Can this be done a bit cleaner? The arrows are entering the process from all sides and therefore I find it hard to follow.

The diagram is laid out such that the main process of the code is shown at the centre, with the inputs on the sides. Unfortunately this was the best layout I could come up with, as the image was from a presentation. I could try to emphasize the start and finish, and use background boxes to separate the actual architecture from the inputs.

  • Decision boundary or feasibility boundary?

The term "decision" boundary was used because it's the edge where the set-based elimination process decides if one set is to be kept or not... this is because you may define a desirability constraint from the objectives. These act to restrict the design space to where a minimum level of improvement can be achieved (e.g. minimise F but you want F to be at least less than 100). These are not compulsory and generally useful after a run of PDOPT with just feasibility constraints. Hence why they are named "desirability", since these do not automatically mean the design should be rejected.

  • The difference between "exploration" and "search" is not clearly defined in text. Please define and differentiate in text.

This is briefly mentioned in the second paragraph of the Statement of Need section. I've expanded it to be more clear, but the following paragraphs explaining how the exploration phase and search phase work should already indicate the difference between the two steps.

Documentation

  • The reviewer checklist requires a statement of need in documentation. Could you please add this? "A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?" I find this related to the questions I have above about the software's usage and limitations.

As mentioned by @kyleniemeyer, there is no need for a direct section but I have extended the description in the introduction page of the online documentation.

I've pushed the changes for the paper, please let me know if you need anything else.