carlos-montiers / enhancedbatch

Enhances your windows command prompt https://www.enhancedbatch.com

Modify heap variable and extension syntax for better backward compatibility #40

Open DaveBenham opened 4 years ago

DaveBenham commented 4 years ago

This issue was split out from #39 to make it easier to track.

EB uses $ prefixed names for heap variables, and @ prefixed names for extensions. I've seen many scripts that already use variable names that begin with those characters. Perhaps switching from environment variable to heap variable might not break existing code, but inability to access variables that begin with @ will definitely break existing code. At DosTips we developed a convention of naming batch macros with an @ prefix. For example, EB breaks my batch port of colossal cave adventure.

In addition to the prefix problem, use of ; will break any code that happens to use ; in an environment variable name. I don't think this is a common practice, but I would still like to ensure that EB never breaks variable expansion for existing code.

I believe all the issues can be solved by using =, since batch does not have the ability to define a variable with = in the name. There are the undocumented predefined variables that begin with =, but they are few and far between.

My proposed new syntax is as follows:

So the following code using old syntax !$var;lower;10! would now be written as !=$var=lower;10! instead.
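To make the proposed form concrete, here is a minimal Python sketch of how the text between the ! delimiters might be split up. It is not EB's actual parser. The grammar is an assumption inferred from the single example above (an =$ or =@ prefix, then = before the first filter, then ; between later filters), and the function name parse_expansion is hypothetical.

```python
def parse_expansion(body):
    """Split the text between the ! delimiters under the proposed syntax.

    Assumed grammar (hypothetical, inferred from one example):
      =$name[=filter[;filter...]]   heap variable
      =@name[=filter[;filter...]]   extension
      anything else                 ordinary environment variable, untouched
    """
    prefix = body[:2]
    if prefix == "=$":
        kind = "heap"
    elif prefix == "=@":
        kind = "extension"
    else:
        # Names that merely start with $ or @, or contain ;, stay whole.
        return ("environment", body, [])

    # The first = after the prefix separates the name from the filters.
    name, sep, rest = body[2:].partition("=")
    filters = rest.split(";") if sep else []
    return (kind, name, filters)


print(parse_expansion("=$var=lower;10"))  # ('heap', 'var', ['lower', '10'])
print(parse_expansion("@macro"))          # ('environment', '@macro', [])
```

Because a native variable name can never contain =, existing names such as @macro, $x or a;b fall through to the environment branch unchanged in this model.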

I know the syntax is not nearly as elegant, but I think backward compatibility is more important than syntax elegance. I'm confident that the expansion code could be modified quite easily to satisfy my proposed syntax. But I am concerned about the code to SET the values. Your patch point would undoubtedly change. Native cmd.exe raises a syntax error when it sees an attempt to define a variable that begins with =. I'm hoping you can intercept that error and conditionally handle the SET if the name begins with =@ or =$.

If this change can be made, then it would be nearly impossible for EB to break existing code related to variable expansion. The only exception is that SUBST allows definition of logical drives named @: and $:. There are the undocumented =@: and =$: variables that would hold the current directory associated with those volumes, and EB would block access to those values. I am not concerned about that potential because:

DaveBenham commented 4 years ago

This is actually the original response from @adoxa back when this was part of #39.

Heap variable names would be prefixed by =$

I think equals itself would qualify as the prefix, so no real need for $, too. Then again, the extra distinction between name and value is useful. In any event, using = makes creating array variables extremely awkward: set =$array=$index=value. Curly expansion makes it look better - set =$array{=$index}=value - but the underlying problem remains. Using another two-character prefix would be far easier, maybe ~$ and ~@?

Filters would be introduced by = after the variable name instead of ;.

I chose ; to match :, thinking it would be rare enough in variable names to get away with it. I could consider =, but it really doesn't look right.

I know the syntax is not nearly as elegant, but I think backward compatibility is more important than syntax elegance.

Backward compatibility is a double-edged sword and is partly the reason cmd.exe is in the mess it's in. I think anyone prepared to use EB should be prepared to modify their existing scripts to work with it.

carlos-montiers commented 4 years ago

I think that for extensions the @ prefix should not be changed: first, because I think it is easier to remember, and second, because in cmd you cannot use call with @ (https://www.dostips.com/forum/viewtopic.php?f=3&t=9230&p=60092#p60092). Thus using it for callable extensions does not break existing code, because you cannot use it that way in normal cmd. As for using = as the prefix instead of $ for heap variables: hmm, maybe, but I like $ more. I think the only thing that might cause a small issue (not tested) is the FOR feature that searches directories, %~$PATH:I. For simplicity I think it is better to use only one convention, so not =@ alongside @, only @. I agree with @adoxa:

anyone prepared to use EB should be prepared to modify their existing scripts to work with it.

carlos-montiers commented 4 years ago

@adoxa, @DaveBenham I thought about this for a long time and decided that it is not good to use the = character in variable names, because it is used as an operator and would lead to confusion. And using any character that is valid in a variable name as a prefix can break any code that has a variable beginning with that character.

In this sense, the use of $ and @ as variable prefixes must be considered a change that can break code using variables with those prefixes. In that case, that code needs to be modified.

but inability to access variables that begin with @ will definitely break existing code.

EB does not prevent access to variables prefixed with @. It only tries the list of extensions first and returns the extension if the name is found there:

>set @hello=Hi
>echo %@hello%
Hi
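The lookup order described here can be modelled roughly as follows. This is only a Python sketch, not EB's code: the extension table and the @demo extension are hypothetical, and name matching is simplified to exact dictionary keys.

```python
# Hypothetical extension table; "@demo" is an invented name for illustration.
EXTENSIONS = {"@demo": lambda: "value from extension"}


def expand(name, environment):
    """Try the extension list first, then fall back to the environment."""
    handler = EXTENSIONS.get(name)
    if handler is not None:
        return handler()
    return environment.get(name)


env = {"@hello": "Hi", "@demo": "shadowed"}
print(expand("@hello", env))  # Hi
print(expand("@demo", env))   # value from extension
```

In this model an existing @ variable is only hidden when its name collides with an extension name.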

In summary: EB introduces small breaking changes through its use of the prefixes $ and @, and it also introduces other small breaking changes such as the += operator in the set command.

Maybe an option for reducing the impact of heap variables (always prefixed with $) is to use an idea from @adoxa: a new extension, in this case @SET, which would be the only way to set heap variables. Thus:

Set $a=b // environment
Call @Set $a=b // heap

This can save old code that uses $ as a variable prefix.

carlos-montiers commented 4 years ago

Hello @DaveBenham, @adoxa. I'm thinking of removing the '$' prefix. Also, heap variables should not be listed in the output of the classic 'set' command. A new command, 'XSET', should be created. It would be analogous to set in that it lists variables: in this case the heap variables, and it could also list the environment variables. In the output, a letter can identify the storage location: H means heap and E means environment. 'XSET' will allow creating heap variables and, in the future, also the variable types mentioned in issue #52.

adoxa commented 4 years ago

I would rather go the other way: SET always uses heap variables by default, with an option/command to use the environment. Adventure.bat is much quicker with EB without even using anything EB-specific (such as the length modifier). I haven't actually tested, but I think it's because of moving all its $ variables out of the environment and into the heap. Moving all the variables would probably have even more benefit. SET has to list heap variables in order for Adventure.bat to work.

ATM, for existing scripts to make use of heap variables requires changing variables, adding the $ prefix; your proposal requires changing the set command (it wouldn't be xset, it would be call @set); the above requires only changing variables that must be in the environment, which would be far fewer, if any.

carlos-montiers commented 4 years ago

@adoxa I think having all variables use the heap is really good (removing the $ prefix). Will all variables in the heap support the local scoping provided by setlocal?

adoxa commented 4 years ago

Ah, good point, I'll have to get setlocal working with heap variables before putting all variables in the heap.

carlos-montiers commented 4 years ago

When EB is loaded, should it copy everything in the environment to the heap? I don't know how cmd implements the local contexts. Maybe it has variables per context, so that when reading a variable it looks in the current context and otherwise searches the previous context?

adoxa commented 4 years ago

No, existing environment variables will remain. Heap will override, though, as we've discussed...somewhere. SetLocal creates a copy of the environment, EndLocal replaces the environment with the copy.
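The behaviour described in this reply can be modelled with a small Python sketch. It is an illustration under stated assumptions, not EB or cmd.exe source: the class and method names are hypothetical, and the heap is deliberately left outside setlocal/endlocal, since setlocal support for heap variables is still pending at this point in the thread.

```python
class Session:
    """Toy model: heap overrides environment; SetLocal copies the environment."""

    def __init__(self, environment):
        self.environment = dict(environment)
        self.heap = {}
        self._saved = []  # stack of environment copies made by setlocal

    def get(self, name):
        # A heap variable overrides an environment variable of the same name.
        if name in self.heap:
            return self.heap[name]
        return self.environment.get(name)

    def setlocal(self):
        # SetLocal creates a copy of the environment.
        self._saved.append(dict(self.environment))

    def endlocal(self):
        # EndLocal replaces the environment with the saved copy.
        self.environment = self._saved.pop()


s = Session({"NAME": "env"})
s.heap["NAME"] = "heap"
print(s.get("NAME"))  # heap

s.setlocal()
s.environment["TEMPVAR"] = "1"
s.endlocal()
print(s.get("TEMPVAR"))  # None
```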

carlos-montiers commented 4 years ago

Hmm, but if we set a variable in the heap that also exists in the environment block, and afterwards delete it, it will need to be deleted from both the heap and the environment. Won't always checking the environment variables be an overhead?

adoxa commented 4 years ago

Well, that still happens now, not that you'd notice, since variables don't typically start with $ or @. Maybe it's not such a good idea, after all.

carlos-montiers commented 4 years ago

How hard is it to copy the whole environment block to the heap when EB is loaded and, after that, create a heap context with each setlocal?

adoxa commented 4 years ago

Not sure what you're asking, there. If EB uses heap for all variables it would still try the environment if it doesn't exist, so there's no need for a copy. But that raises the problem of deleting a variable: does it delete the environment, too, or not? Is that where your copy comes in? We copy the initial environment to the heap (not a problem) then solely use the heap, so deleting a variable stays deleted, but an environment variable will be restored when EB exits (since it was never actually deleted). That might be a sensible approach. Local would be done via separate contexts, so that would be trivial.
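The "copy once, then use only the heap" approach could look roughly like the Python model below. It is a sketch under assumptions, not the actual design: the class name and the copy-on-setlocal strategy for contexts are hypothetical choices made for illustration.

```python
class HeapOnlySession:
    """Toy model: the environment is copied on load and never touched again."""

    def __init__(self, environment):
        self.environment = environment        # left as-is for when EB exits
        self.contexts = [dict(environment)]   # contexts[-1] is the current one

    def get(self, name):
        return self.contexts[-1].get(name)

    def set(self, name, value):
        self.contexts[-1][name] = value

    def delete(self, name):
        # Deleting only affects the heap, so the variable stays deleted
        # here while the real environment still holds the original value.
        self.contexts[-1].pop(name, None)

    def setlocal(self):
        # Assumption: a new context starts as a copy of the current one.
        self.contexts.append(dict(self.contexts[-1]))

    def endlocal(self):
        if len(self.contexts) > 1:
            self.contexts.pop()


env = {"A": "1"}
s = HeapOnlySession(env)
s.delete("A")
print(s.get("A"))  # None
print(env["A"])    # 1
```

Since lookups never fall back to the environment in this model, a deleted variable cannot reappear while EB is loaded.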

carlos-montiers commented 4 years ago

I think that if we implement the variable types with an address property, then when deleting a variable we set its address to null; on unload we iterate over all the variables with a null address and remove them from the environment.
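A rough Python model of this null-address idea follows. It is a sketch under assumptions: the Variable and TombstoneHeap names are hypothetical, and the address field simply holds the value here instead of a real memory address, with None standing in for null.

```python
class Variable:
    def __init__(self, address):
        # Stand-in for the address property; None plays the role of null.
        self.address = address


class TombstoneHeap:
    def __init__(self, environment):
        self.vars = {name: Variable(value) for name, value in environment.items()}

    def get(self, name):
        var = self.vars.get(name)
        return None if var is None else var.address

    def set(self, name, value):
        self.vars[name] = Variable(value)

    def delete(self, name):
        # Keep the entry but null its address, so unload can find it later.
        if name in self.vars:
            self.vars[name].address = None

    def unload(self, environment):
        # Remove every variable marked as deleted from the real environment.
        for name, var in self.vars.items():
            if var.address is None:
                environment.pop(name, None)


env = {"A": "1", "B": "2"}
heap = TombstoneHeap(env)
heap.delete("A")
heap.unload(env)
print(env)  # {'B': '2'}
```

Unlike restoring the environment on exit, this variant carries deletions over to the environment when EB unloads.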