Closed damar-wicaksono closed 1 month ago
Thank you very much for the useful comments and suggestions. The responses and changes made to the code are listed here.
The changes have been provided with pull request #9.
The modifications these comments require will be addressed in the pull request, following issue #8. Since it is more related to the documentation, we decided to move the changes to the other branch in order to have an overall review of the tutorials.
computeLebesgue
has been split in order to execute it even after the training of the GEIM algorithm on separate codes, avoiding re-training the GEIM algorithm. This can be useful when different Lebesgue constants have to be compared.synt_test_error
in the online phase
namedtuple
, as suggested, for errors and computational time to help the access to relevant quantities, structured as follows:
Results = namedtuple('Results', ['mean_abs_err', 'mean_rel_err', 'computational_time'])
The only exception is the PE
class which includes also the estimated parameters, resulting from the optimization process.
synt_
stands for synthetic, since the measurements are coming from simulation data (not actually collected on an experimental facility). For POD-related algorithms, it's used to highlight the fact that there is a comparison with simulation data, as well. GEIM, TR-GEIM and PBDW also include a real_reconstruct
method to reconstruct the field from real data (given as input as arrays).
geim
class (same for pbdw
) both in the subpackage offline and online. I am aware that this can create confusion, however they are not meant to be executed in the same script for real applications, as a consequence of the offline/online paradigm. We would prefer to keep this terminology to avoid importing like from pyforce.online.geim import geimOnline
which we think can be a bit repetitive. I am adding some figures in the docs to help the user show how the classes are connected, as you suggested.gaussian_sensors
to GaussianSensors
, import_H5
to ImportH5
, POD_project
to PODproject
are modified.pyforce.offline.sensors
is a class required to generate the Riesz representation of a set of sensors, either in L2 or H1, necessary by GEIM and PBDW. As already mentioned, the connection between the different classes is being added to the documentation and they should provide a clear overview of the package functionalities (for instance, the basis functions from either POD or GEIM can be mixed together with GEIM sensors in the PBDW class; other combinations are not suggested).
For what concerns the tutorials, they are under update in the issue #8 to help external users understand the package capabilities.Let us know if some changes or responses are not clear, thank you again :)
Thank you for going through my comments!
Here are my replies:
The modifications these comments require will be addressed in the pull request, following issue https://github.com/ERMETE-Lab/ROSE-pyforce/issues/8.
Thank you!
The function computeLebesgue has been split in order to execute it even after the training of the GEIM algorithm on separate codes, avoiding re-training the GEIM algorithm. This can be useful when different Lebesgue constants have to be compared.
Okay!
The output of each method has been changed to namedtuple, as suggested, for errors and computational time to help the access to relevant quantities
Thanks!
synt_ stands for synthetic, since the measurements are coming from simulation data (not actually collected on an experimental facility).
Thank you for the explanation. I think it would be nice to put your explanation (or a version of it) in the codebase as in-code documentation at a point(s) these terms appear prominently or in the documentation itself regarding the difference between these two terminologies ("synthetic measurements" vs "real measurements").
We would prefer to keep this terminology to avoid importing like from pyforce.online.geim import geimOnline which we think can be a bit repetitive. I am adding some figures in the docs to help the user show how the classes are connected, as you suggested.
Understood!
Sorry about the typo about GramSchmidt: it has been fixed.
Great!
An overall check on the classes has been performed to be consistent with the camel case convention.
Thank you!
Schemes of the different classes and algorithms for the offline and online phase have been added to the documentation (link to images and development branch) and the read me:
I think the diagrams for classes and the required data and how they relate to each other would be a nice addition to the documentation. Thanks!
The POD, GEIM and PBDW classes have been implemented in different time periods and to stay simple no parent/abstract class was coded; even though it is true they do share a lot of similarities, especially in the online phase. Discussing with @Neko-tan, we think it will be for sure one of the main enhancements of future releases of the package and it is planned in the future.
I understand. The current structure is not a breaking point but it would be worthwhile to revisit it in the future. Thank you.
pyforce.offline.sensors is a class required to generate the Riesz representation of a set of sensors, either in L2 or H1, necessary by GEIM and PBDW. As already mentioned, the connection between the different classes is being added to the documentation and they should provide a clear overview of the package functionalities (for instance, the basis functions from either POD or GEIM can be mixed together with GEIM sensors in the PBDW class; other combinations are not suggested).
Yes, I think the documentation update on this would certainly be helpful.
I think you've addressed all my comments and I have no further points to raise in this area, except for a minor point above regarding updating the documentation a bit.
As the remaining points are deferred to the documentation, I will have a look into it separately.
I'm closing this issue now...
This issue is related to the JOSS submission review.
The tutorials provided serve as a basis for reviewing the functionality of the software. Overall, the tutorials can be executed successfully with additional remarks that I detail below.
Installation
The requirements based on the
environment.yml
file allows the package to be installed successfully viaconda
in MacOS 14.5 (M2). However, when creating series of plots in the tutorials the following packages are missing:trame
ipywidgets
openpyxl
trame-vtk
[x] Please double check if these packages are indeed necessary (at least to follow the tutorials) and include them in the requirements (perhaps optionally).
[x] In MacOS 14.5, the statement
pv.start_xvfb()
raises exception aboutlibgl1-mesa-glx xvfb not available
. However, removing this statement will allow the plots to be generated.[x] It appears (also in MacOS 14.5), some plotting causes a complain that
vue
must bevue2
instead ofvue3
; this may be related to the installed version oftrame
.[x] The dependencies version listed in the
environment.yml
are specified exactly (i.e., specified up to thePATCH
version) which I think may lock the versions required and can be restrictive. Is this intentional?Performance
As I don't see any performance claim regarding the package, I believe this criterion is irrelevant.
Functionality
from pyforce.offline.geim import computeLebesgue
as follows:Both
magic_fun
andmagic_sens
are attributes ofGEIM
instances. Is there a particular reason not to implement this function as a method of the class (GEIM.compute_lebesgue()
orGEIM.lebesgue()
?[x] I find that the tutorials are a bit cluttered with complicated plotting routines and they often get in the way of presenting the (expected) results. One straightforward way to declutter the tutorials is by extracting these routines into local modules that are only relevant for the tutorials. Alternatively, if some of the plotting routines are general in nature, they can be implemented as a high-level plotting functions directly in the code base (but only if they are general).
[x] According to the tutorials, the method
synt_test_error()
is often called with respect to instances of classes in the online phase. The return value is a tuple but each implementation of the method have different length and for quantity that means the same the position may not be consistent (e.g.,computational_time
often appears as the last element of the tuple except for instances of thePE()
class).In my opinion tuple of length beyond 2 (absolute and relative errors) is already difficult to intuit which index signifies which quantity. Perhaps it would be better to store the output of the method as a dictionary or a named tuple whose relevant quantities can semantically be better accessed.
PS: What does
synt_
stand for?GEIM
andPBDW
have exactly the same name in two different modulesoffline.geim
(resp.offline.pbdw
) andonline.geim
(resp.online.pbdw
). While this is definitely not an issue in Python, I'd argue that they can be confusing for high-level classes. We must know from where which is imported and in case we need both on, say, the same script, we have to rename one of them on the fly (viaas
). I'd propose renaming the classes such that the name indicates whether the class is related to the offline or online phase (likePOD
andPODI
)Other remarks
For further discussion...
POD
has a method calledGrahmSchmidt
, isn't this supposed to beGramSchmidt
? (referring to Gram-Schmidt orthogonalization procedure?)POD_project()
is not exactly following the convention.Perhaps, this comment is more suitable for the documentation part but I'll put it here anyways...
The package provides the following methods/techniques:
Following the offline/online workflow, the package consists of two main sub-packages:
pyforce.offline
andpyforce.online
. Looking at the features provided above, the following classes represent each technique:pyforce.offline.pod.POD
implements POD in the offline phase (pyforce.offline.pod.DiscretePOD
does not appear in the tutorial, I believe)pyforce.online.pod_interpolation.PODI
implements POD in the online phase via interpolationpyforce.online.pod_projection.POD_project
implements POD in the online phase via projectionpyforce.offline.geim.GEIM
implements GEIM in the offline phasepyforce.online.geim_synthetic.GEIM
implements GEIM in the online phase (without Tikhonov's regularization)pyforce.online.tr_geim_synthetic.TRGEIM
implements GEIM with Tikhonov's regularization in the online phasepyforce.offline.pbdw.PBDW
implements PBDW in the offline phasepyforce.online.pbdw_synthetic.PBDW
implements PBDW in the online phasepyforce.offline.sensors.SGREEDY
(exemplified in the tutorial)pyforce.offline.sensors.gaussian_sensors
(perhaps not a main feature)pyforce.online.indirect_recon.PE
(exemplified in the tutorial)It seems to me POD, GEIM, PBDW share many commonalities (the details differ, though) and I would assume that these classes serve the same overall purpose (again with their own particularities):
V
andnorm
(note that inPBDW
it is callednorms
with ans
) as attributes. Furthermore, the online counterparts all have the methodsynt_test_error()
.pyforce.offline.sensors
in the context ofPOD
,GEIM
, andPBDW
. Currently the documentation lacks the intermediate-level picture of data flow between the phases that can be translated directly to the code (the Figure on DDROM is too high-level and not readily translatable to the usage workflow and code structure). In the tutorials, numerous quantities are saved and loaded in between notebooks which makes it a bit hard to get the bigger picture of the package capabilities; that is, the functionalities provided by the package via its classes get a bit lost in things being saved and plotted.