kanors-emr / Veda2.0-Installation

Veda2.0 is a data handling system for The Integrated MARKAL-EFOM System (TIMES) - a bottom-up optimization model for energy-environment systems
https://www.kanors-emr.org/
3 stars 0 forks source link

User defined commodity groups are returned by Cset_CN field #21

Open olejandro opened 9 months ago

olejandro commented 9 months ago

It seems that user-defined CGs are currently returned by Cset_CN field. Should they not only be retrievable by Other_Indexes?

Antti-L commented 9 months ago

Comment: Any commodities matching CSet_CN must of course be allowed. User-defined CGs can be commodities, and when so, must be allowed matching by Cset_CN. I guess items not being commodities might be supported by VEDA for user convenience, only for those parameters that are expecting a single CG (which would very often be a commodity, and therefore users would expect CSet_CN to be supported anyway, and it would be awkward if using two different fields would be forced for the same index).

If it has been like that for years in VEDA, and users have been happy until now, I think it would be a bit harsh to break it now: Many models might stop working. You could always use Top_Check where necessary to limit the CGs to commodities in the topology. I guess Other_Indexes does not even support wildcards, which may be another reason to support CSet_CN.

olejandro commented 9 months ago

@Antti-L could you please clarify what you mean by "User-defined CGs can be commodities"? I guess an example utilising the appropriate TIMES GAMS sets would be easiest to understand.

Antti-L commented 9 months ago

Well, I just wanted to point out again that the set of commodities is a sub-set of the set of CGs, and so if an item X is a member of the set Com_Grp, it can also be a member of the set COM. Likewise, if an item is a user-defined CGs, such an item can also be a member of the set COM. Basically all commodities are user-defined, and so for a user-defined CG with the label X to be a commodity, the user would have to specifically define that item X as a commodity, and such is allowed both in TIMES and VEDA. However, TIMES does not allow commodities that have user-defined members to be in the topology. But there are indeed a few use cases for such non-topology group commodities, e.g. in COM_PKRSV / COM_PKFLX, and "advanced" uses of FLO_SHAR, where one can use a group commodity instead of an elementary commodity, to get some added functionality.

olejandro commented 9 months ago

Thanks @Antti-L. So basically a user-defined CG, let's say called RSD_SH, that includes 3 commodities, let's say called RSD_SH-Apt, RSD_SH-Att, and RSD_SH-Det, can also be a member of the set COM? However, a commodity flow that includes RSD_SH will be invalid?

Antti-L commented 9 months ago

commodity flow that includes RSD_SH will be invalid?

Not sure I fully understand 'includes' , but a commodity flow in RSD_SH (i.e. a process flow in RSD_SH, such that RSD_SH is in the process topology), or a commodity balance in RSD_SH, is invalid.

olejandro commented 9 months ago

Well, a commodity flow consists of a process and a commodity, right? Ok, so based on our conversation, I believe that indeed the behaviour I observed is a bug, because there was no commodity declared of the same name as the user-defined CG.

What do you think @akanudia?

Antti-L commented 9 months ago

So, do you mean that you would like that "fixed", such that many models would no longer work? :smile: I mean that I assume here that we are talking about attributes accepting any CG, not just a commodity, and therefore VEDA accepts specifying CGs. But of course, if you mean attributes accepting only a commodity, then I agree it would be a bug. Can you confirm this, and mention which attribute appears to accept any CG for a commodity index?

olejandro commented 9 months ago

Sounds very dramatic / traumatic @Antti-L 😅

Judging from your last comment and the questions, I suspect you misunderstood the issue / point I'm trying to make.

I believe it is a bug that a query on commodities returns user-defined commodity groups not defined as commodities in addition to commodities. I'm not saying anything regarding how that list of commodities should be processed.

If any model is relying on a user-defined CG to be returned by the query, an easy fix would be just to define it as commodity in system settings.

Antti-L commented 9 months ago

It would help if you would make clear whether you are talking about a commodity index or a CG index. If the index is a CG index, I think you cannot simply assume that there is a query on commodities only, when the attribute supports any CGs for the index in question. Wouldn't it then be clearly by design and useful? Anyway, I think you should also make clear whether or not you are requesting a change that would break backwards compatibility.

olejandro commented 9 months ago

I'd suggest to wait for any comments from the developers before proceeding further with the discussion. 😊

akanudia commented 9 months ago

Well, Antti had given a comprehensive list of benefits of this behaviour and also why it would be undesirable to change it.

Veda has evolved over two decades and it has several exceptions of this nature. What I am curious about is that how is this issue hurting you as a Veda user? Veda is just letting you avoid doing something that you would have to otherwise - "an easy fix would be just to define it as commodity in system settings.".

I would understand if you wanted me to highlight this feature in the documentation so that users can take advantage of it, but I don't understand why you would want to downgrade an application to make it conform to simpler rules.

olejandro commented 9 months ago

Thanks @akanudia. Well, I guess, the main reason for me to open issues and engage in the discussion here is to get consistent, predictable and easy to understand behaviour from the software that I use very often. I hope this also results in a better product for you.

I also find it important to distinguish behaviours that seem like a "hack", essentially technical debt, from evolution.

Coming back to the point, there are several column headers, incl. CSet_CN, in TFM tables that collectively specify a list of commodities that is later used to specify indexes of TIMES attributes. Judging from the documentation and the field names the list should only include commodities, however it seems to include all (?) commodity groups, i.e. also those CGs that are not defined as commodities. I have not tested it, but I suppose Veda generated CGs are included, as well?

My suggestion to fix the inconsistency was just one suggestion (one that I believe allows the user to stay in control and enables Veda to assess whether the user knows what he/she is doing). There are definitely others! One could, for example, change the column name to indicate CGs instead of commodities and update the docs (the original column headers could be treated as aliases to ensure backwards compatibility). It is just important that the solution leads to increased transparency and better user experience.

In my case, for example, there were several tables with the pattern commodity-name* specified in the CSet_CN column. No commodity commodity-name was defined in the model. I later introduced a commodity group commodity-name which led to Veda throwing a number of errors. This wasn't a problem for me, except for a disappointing user experience, but users with less experience may have been at a loss.

Antti-L commented 9 months ago

@olejandro : I just tested with a UDCG, trying to define a few attributes with that UDCG being specified in CSet_CN. I am seeing the UDCG being accepted for attributes expecting a CG, but getting a "Commodity is invalid" error/warning for attributes expecting a commodity. I think both results are as expected. And the error messages in the second cases make it clear that the UDCG is not a valid commodity. The same results occur also with a group-name* specification, where group-name is the name of a user-defined commodity group.

Therefore, I cannot confirm VEDA throwing any errors that would make the user "at a loss". Isn't that "Commodity is invalid" error quite clear here, and should have been issued regardless of how VEDA is designed to otherwise handle UDCGs (because the TIMES attribute expects a valid commodity, which the UDCG was not)?

olejandro commented 9 months ago

@Antti-L I definitely can see how a user can be at a loss. If you follow my example, the error messages appear after defining a CG, without intentionally specifying it anywhere, because it gets picked up by the filter.

Antti-L commented 9 months ago

@olejandro : Thanks for the clarification, I didn't know you didn't create that UDCG intentionally to match those patterns, but you accidentally created an UDCG that was matching the patterns? If so, yes I see that may be a confusing user mistake, but related accidental matching could easily happen with any patterns, whenever you create new items that may match some existing patterns. With that UDCG case you at least get those errors, while in other cases your errors might get unnoticed.

Antti-L commented 9 months ago

@akanudia : Would it be possible to suppress the "Invalid commodity" error message when a wildcard pattern is used? I guess users would most likely prefer such not being issued in that case, but only valid commodities be matched with a pattern mask.

olejandro commented 9 months ago

Well, I don't see it as a user mistake if the expected behaviour of that fille filter is to return a list of commodities, not CGs.

olejandro commented 9 months ago

@akanudia : Would it be possible to suppress the "Invalid commodity" error message when a wildcard pattern is used? I guess users would most likely prefer such not being issued in that case, but only valid commodities be matched with a pattern mask.

Actually this would make things less transparent and, in my view, worse.

akanudia commented 9 months ago

@olejandro : this is on your first post of today. I don't see "..enables Veda to assess whether the user knows what he/she is doing" as a part of Veda's mandate at all. In fact, knowing what you are doing is a prerequisit for using Veda. It will let you work faster and with more transparency if you know what you are doing and if you know what you CAN do.

For example, you can refer to Veda generated CGs as NRGO, NRGI, DEMO etc in any CSET_CN column. Of course, they will work in Other_indexes too :).

I don't understand your suggestion about renaming the columns and aliases etc. From my point of view, users are free to put CGs in only other_indexes col and not in commodity columns.

In your suggested approach, the user will be responsible to remember that FLO_BND and ACT_EFF (and many others) need a commodity group and not commodity. CG is a very technical concept, which is a single commodity in a vast majority of the cases in normal models. I see no need to make it compulsory for users to understand it.

I did not see this as clearly in the beginning, but I think this is one of the important features of Veda that make it easier to use a complex model like TIMES. Just like the ability to use "Share" with a process and commodity, without having to deal with the denominator if the default works.

olejandro commented 9 months ago

That's fine @akanudia. I've opened this issue, because as a user I believed there was a bug in Veda. Based on your and @Antti-L responses, I get an impression that's an intended behaviour.

For multiple reasons I don't think it is a good behaviour, but it is up to you to do anything about it; to make it more user-friendly.

akanudia commented 9 months ago

Thanks, @olejandro, so we agree to disagree.

I will just state the main reason why we see this as an important feature and not a bug: One of the core principles of Veda is to expose users to advanced modeling concepts only when they need them. This is why we have a layer of attributes like EFF, CEFF, Share, ENVACT, which let people think of their systems more intuitively (like other simpler models do) and map these attributes to the more sophisticated TIMES parameters. Topology is managed via attributes as well (like in other simpler models) - explicit specification is not compulsory. In other words, Veda manages_ the complexity of TIMES if you want to do simple modeling.

On the other hand, the same application allows you to set up models with all TIMES sets and parameters declared in their original form - using no Veda defaults.

This flexibility is based on features that may appear like bugs if you think in a certain rule-bound way.

I have seen some users frustrated with Veda simply because they have messy Excel files. This typically happens when different users work on models over the years, and they copy-paste existing structures without utilizing better routes that Veda offers. Veda is so flexible that it is almost like a language. The clarity and beauty of what is written is entirely up to the user.

olejandro commented 8 months ago

Thanks @akanudia. In my experience Veda's "flexibility" generates a lot of confusion among users. Very few are able to work independently with it (i.e. without requiring support from a consultant or using a working example from the forum). I do not think this is good, especially given how simple the principles behind it are.

akanudia commented 8 months ago

@olejandro , can you give a concrete example of how you suggest to limit the flexibility of Veda in order to make it more user friendly?

Like I mentioned earlier, one can always work with one or two ANSWER-like formats in all Excel files. I think the main problem is that these formats have not been highligted anywhere.

akanudia commented 8 months ago

simplified Veda syntax for demo models.xlsx

The first sheet of this file is the Con_ELC sheet of DemoS_003. The second sheet generates the same data, but using a syntax that would be far easier for new users.

Mauri and I had been careful to introduce the different Veda input files and tags progressively, motivating their need when doing so, when we revamped the Demo models a few years ago. I think we need another pass - this time starting with a few standard formats and introducing the optional ones progressively. Maybe the advanced syntax options could be shown in a separate model - with some explanation on how they help in certain situations.

To summarize, users should be introduced to TIMES parameters via simple formats. Veda acrobatics should be a separate section of the training material.

olejandro commented 8 months ago

Thanks, @akanudia. I believe it is a good idea to utilise spiral learning. At the same time, it is also very important to keep things consistent. The use of CSet_CN (i.e. the current issue) is probably a good example of how use of existing structures is changed over time to support TIMES features to a greater extent, making things confusing.

In Software development, it is normal to introduce new / change functionality over time. Sometimes changes are made that are not backwards compatible, but make the product better. As long as users know how to go about the changes and have time to adapt, it should be fine.

Antti-L commented 8 months ago

@akanudia @olejandro : Returning to the current issue, I myself found it unclear what Olex was actually suggesting, but at the risk of repeating myself, I explain my view of the issue below:

In the example Olex provided, he had had a pattern cname* used in the CSet_CN field for an attribute expecting a commodity index. And he said that he later introduced a commodity group with the name cname . which led to Veda throwing a number of errors, and that caused Olex both confusion and disappointment.

As described, the issue was clearly about an attribute expecting a commodity index, and I see basically two main things to consider about what VEDA does:

  1. Did VEDA accept or reject something unexpected for the commodity index of the attribute? I don't think so, because the user-defined group was rejected in this case, and would only have been accepted if it had been defined as a commodity (my tests confirm that being so). And I think accepting only valid commodities for a commodity index, like VEDA does, is indeed most reasonable.

  2. Did VEDA do well, or not-so-well, in issuing those error messages about invalid commodities? I think it was not-so-well in this wildcard filter case, because the error messages were related to existing labels having an invalid item type for the attribute, and as such those errors were not consistent with the behaviour of wildcard filters elsewhere (e.g. wildcard filters under PSet_PN do not raise errors about any existing labels). However, to my surprise, Olex told above that he nonetheless considers such error messages about existing labels of invalid types useful.

Some illustrative examples about the wildcard filter behaviour:

Therefore, as far as I see, changing 2), by removing the unexpected error messages when using a wildcard filter would indeed improve consistency and user experience here. However, as Olex already disapproved my suggestion about it, I am left with confusion (and some disappointment), as I cannot see much anything else that could be reasonably changed to improve handling the case Olex described. Anyway, I think the wildcard filter behaviour is definitely inconsistent also with respect to NRGI/NRGO/MATI/MATO/ENVI/ENVO being currently accepted as commodities for attributes when using a wildcard filter, although they are not even CGs.

olejandro commented 8 months ago

Thanks @Antti-L. I am sorry, if I wasn't clear. My main issue is with the list the filter is applied to. Given the column name, i.e. Cset_CN, I expected the filter to be applied to the list of all commodities; it seems to be applied to the list of (not all, according to your findings?) commodity groups. Here the column name sets rather clear expectations on the input, so anything different feels inconsistent.

@Antti-L, my main issue with removing those error messages is that it will just "hide" the current behaviour. At least now one can find out about it.

Antti-L commented 8 months ago

Well, I was by no means suggesting to hide the inconsistent behaviour, but only to fix it. :smile: In the case you reported, there was no inconsistent behaviour related to the generation of model database records from the wildcard filter. The wildcard filter generated database records only for commodities, fully as expected. The inconsistent behaviour was related to the generation of error messages from the wildcard filter, which produced unexpected errors for UDCGs that were not commodities. I was suggesting to fix that issue, and of course, fixing it should not change the generation of database records, which was working fine, but only the generation of error messages, where the wildcard filter should in my view not have been applied to those UDCGs that were not commodities. Removing the unexpected messages would be a reasonable fix, while just hiding them would not make sense.

olejandro commented 8 months ago

@Antti-L I've been trying to adjust my language to be more clear, but it doesn't seem to be working. E.g. ”wildcard filter generated database records only for commodities, fully as expected" I don't think this is a correct statement, unless what you are referring to as a database is the gams input.

I'll reach out by email to find time for a call to discuss this and clarify any misunderstandings as they occur.

olejandro commented 7 months ago

@Antti-L and I have discussed this separately and seem to agree that:

@Antti-L please feel free to add if I missed anything. Thanks a lot for engaging on this!