Qucs / ADMS

ADMS is a code generator for the Verilog-AMS language
GNU General Public License v3.0
94 stars 32 forks source link

include paths and include directive tidy up #102

Open felix-salfelder opened 2 years ago

felix-salfelder commented 2 years ago

Currently admsXml adds '.' (current working directory) to the back of the include path list. This seems wrong and unusual. The C preprocessor has a more well behaved include directive, and does not do this.

NB: C has two different includes (quotes vs angle brackets) and the comparison might be inappropriate. It would be better to compare with some other verilog tool.

NB2: The reason the extra '.' has been added is probably related to the generation of header files (in '.'). This only caused problems (due to race conditions) and is no longer needed.

As a start, the implicit '.' should be removed, and instead specified by the user ('-I.') on demand. Then, less importantly, include directives should be resolved relative to the file they occur in, such that files in the same directory are found (unless found in a path specified by -I).

ngwood commented 2 years ago

Hi Felix.

It is actually at the front still.

Current search order for devel as of 351f20e is

  1. current working directory;
  2. command line include directories;
  3. ADMS include directory.

This is what we want, I think.

ngwood commented 2 years ago

Tested by putting BROKEN constants.vams file in current working directory.

bash> admsXml -x dummy.va 
[info...] -x: skipping any implicit xml scripts
[info...] admsXml-2.3.7 (unknown) Jan  2 2022 01:08:04
[fatal..] constants.vams: during lexical analysis syntax error at line 36 -- see 'THIS_IS_A_BAD_LINE'
bash> admsXml -x -I./include dummy.va 
[info...] -x: skipping any implicit xml scripts
[info...] admsXml-2.3.7 (unknown) Jan  2 2022 01:08:04
[fatal..] constants.vams: during lexical analysis syntax error at line 36 -- see 'THIS_IS_A_BAD_LINE'
bash> rm constants.vams 
bash> admsXml -x -I./include dummy.va 
[info...] -x: skipping any implicit xml scripts
[info...] admsXml-2.3.7 (unknown) Jan  2 2022 01:08:04
[info...] elapsed time: 0 (second)
[info...] admst iterations: 0 (0 freed)
ngwood commented 2 years ago

I see no issue with removing the implicit -I. though, especially as this is not a standard thing to do and may catch people out! I also like the idea of adding in the implicit relative search path from the file containing the include.

felix-salfelder commented 2 years ago

On Mon, Jan 03, 2022 at 07:39:15AM -0800, Neal wrote:

Current search order for devel as of 351f20e is

  1. current working directory;
  2. command line include directories;
  3. ADMS include directory.

This is what we want, I think.

It should be changed to

  1. command line include directories;
  2. ADMS include directory.

because it makes more sense: you can add -I., but you can't un-add it. Or: Find how do other tools implement `include, and decide based on that.

Your -I patch + this fix + constants.vams is a good incentive to make a release I reckon.

ngwood commented 2 years ago

Removing it makes sense. I know that doing this alone will cause the Xyce buildxyceplugin script and all of their internal Makefiles to break for multi-file models though. Worst case, they can just add an explicit -I. in the files and still be usable with previous versions of ADMS. (This would also be a good reason for buildxyceplugin to add an interface to -I; it's inconvenient not having one!)

It would be good to also implicitly look relative to the file being processed though, as that is fairly standard practice; e.g.,

`include "relative/include/file.vams"

should look relative to the file being proceed first (instead of the CWD) and then relative to any path declared by by the user with an explicit -I. All that is required to achieve this is to find a way to add the directory of the file being processed to the includePath list, which seems not too tricky. This would also mean that the Xyce build tools would continue to work unchanged.

ngwood commented 2 years ago

Your -I patch + this fix + constants.vams is a good incentive to make a release I reckon.

I would suggest we fix issue #39 too. The earliest version of disciplines.vams available from Accellera cannot be processed with ADMS, in part, because of this. I don't know enough about Bison to fix this myself (yet). There is also another syntax change that would need to be fixed; again, a Bison change.

ngwood commented 2 years ago

I found a fix for issue #39 already. The last related thing I would suggest to get ADMS usable with the most recent Accellera standard library files is to get around their (presumably deliberate) use of an escaped identifier inside the disciplines.vams file. These are identifiers that are allowed to have extra special characters (not just $) in the name; namely any printable ASCII character, such as @, ", {, etc. This is a universal feature of the Verilog-AMS language, but no one actually uses this feature of the language as I don't think anyone supports it. Given that the offending identifier \logic (note the space at the end) is equivalent to the identifier logic (trailing space is irrelevant in free-form language) I suggest we do a nasty hack!

bash> git diff
diff --git a/admsXml/verilogaLex.l b/admsXml/verilogaLex.l
index ea1dfe4..4faabbb 100644
--- a/admsXml/verilogaLex.l
+++ b/admsXml/verilogaLex.l
@@ -274,6 +274,7 @@ INF {TKRETURN(yytext,yyleng); return tk_inf;}
 \|\| {TKRETURN(yytext,yyleng); return tk_or;}
 \^\~ {TKRETURN(yytext,yyleng); return tk_bitwise_equr;}

+\\{ident} {TKRETURN(yytext,yyleng); return tk_ident;}
 \${ident} {TKRETURN(yytext,yyleng); return tk_dollar_ident;}
 {char} {TKSTRIPPEDRETURN(yytext,yyleng); return tk_char;}
 {b8_int} {TKRETURN(yytext,yyleng); return tk_integer;}

It's not great but it's much better than not doing anything at all. We'd just need to document the heck out of this!

What do you think?

tvrusso commented 2 years ago

Just weighing in here: I do not really like the suggestion of removing "search current directory" as a default behavior for ADMS, mostly because of the impact it will have on Xyce's use of ADMS.

Verilog-A models are often shipped as bundles of files that include each other in the same directory, and requiring an explicit "search this directory" command line option would be annoying given that it has always been the default behavior for as long as ADMS has existed.

Could we please just leave it as it is?

felix-salfelder commented 2 years ago

On Mon, Jan 03, 2022 at 11:43:50AM -0800, Tom Russo wrote:

Verilog-A models are often shipped as bundles of files that include each other in the same directory, and requiring an explicit "search this directory" command line option would be annoying given that it has always been the default behavior for as long as ADMS has existed.

I see your point. Note that "include each other in the same directory" was never the behaviour (unless that directory happens to be '.').

Could we please just leave it as it is?

If someone sends a patch to make "include each other in the same directory" work, I will certainly consider it. Better pass '-I.' where needed.

tvrusso commented 2 years ago

I will note that other Verilog-A compilers (Agilent's, for example) search for include files without a specified path by looking first in the same directory as the file being processed, then in any other search path given to the compiler.

If ADMS were changed to do that, I'd be behind it (though I'm not going to submit a PR to make it happen). But until that happens, please do not remove the default of looking in the current working directory first. Especially since this is long-standing behavior since the dawn of ADMS.

felix-salfelder commented 2 years ago

On Mon, Jan 03, 2022 at 12:21:00PM -0800, Tom Russo wrote:

I will note that other Verilog-A compilers (Agilent's, for example) search for include files without a specified path by looking first in the same directory as the file being processed, then in any other search path given to the compiler.

Thanks. That confirms my suspicion.

If ADMS were changed to do that, I'd be behind it (though I'm not going to submit a PR to make it happen). But until that happens, please do not remove the default of looking in the current working directory first. Especially since this is long-standing behavior since the dawn of ADMS.

Agreed, lets do all or nothing.

ngwood commented 2 years ago

This appears to be all that is needed for this.

diff --git a/admsXml/admsXml.c b/admsXml/admsXml.c
index 5d314b4..9dba440 100644
--- a/admsXml/admsXml.c
+++ b/admsXml/admsXml.c
@@ -615,7 +615,7 @@ static void parseva (const int argc,const char** argv,char* myverilogamsfile)
     pproot()->error=0;
     adms_slist_push(&pproot()->skipp_text,(p_adms)(long)(0));
     pproot()->includePath=getlist_from_argv(argc,argv,"-I","directory");
-    adms_slist_push(&pproot()->includePath,(p_adms)".");
+    adms_slist_push(&pproot()->includePath,(p_adms)dirname(myverilogamsfile));
 #ifdef ADMS_INCLUDEDIR
     adms_slist_concat(&pproot()->includePath,adms_slist_new((p_adms)ADMS_INCLUDEDIR));
 #endif

This would make ADMS behave similarly to other Verilog-A compilers. I am fairly sure that this wouldn't interfere with how Xyce uses ADMS. The CWD could always be manually added with the -I. command line option.

felix-salfelder commented 2 years ago

On Mon, Jan 17, 2022 at 04:36:04AM -0800, Neal wrote:

  • adms_slist_push(&pproot()->includePath,(p_adms)".");
  • adms_slist_push(&pproot()->includePath,(p_adms)dirname(myverilogamsfile));

This would make ADMS behave similarly to other Verilog-A compilers. I am fairly sure that this wouldn't interfere with how Xyce uses ADMS. The CWD could always be manually added with the -I. command line option.

Thanks for looking into this one.

I have doubts about the correctness of the approach, and I am unsure what "correct" even means. How about

A.va # include "sub/B.va" sub/B.va #include "C.va" sub/C.va # the desired file C.va # a stray (but wrong) file.

Will it not include the wrong one? (What will others do here?)

What happens if -I. is passed in the above example? will C.va or sub/C.va win? Is there anything in the standard on this at all?

(Same question with -Isub, and no "sub/" in A.va.)

I like the idea of a tradeoff between unconditional -I. and a correct implementation, and I think your patch has a chance. What if users start depending on odd behaviour (again)?

ngwood commented 2 years ago

Ahh yes. This will not work with nested include files. It will only add the directory of the base Verilog-A file to the include path. I will give this some more thought.

ngwood commented 2 years ago

The Verilog-AMS standard refers to the Verilog standard, which does not explicitly specify how relative paths should be evaluated.

From what I can see, any change would need to occur in the adms_preprocessor_lex_include_file function in preprocessorLex.l. It would be a lot of work to get this feature working though. It looks like it would be possible to go through each scanner of pproot()->Scanner and use the scanner->filename entries to construct temporary directory paths to push onto pproot()->includePath right before using adms_file_open_read_with_path that can be removed immediately after creating the file handle.

ngwood commented 2 years ago

I think this does what you're after.

diff --git a/admsXml/preprocessorLex.l b/admsXml/preprocessorLex.l
index 565ae9f..884bdbf 100644
--- a/admsXml/preprocessorLex.l
+++ b/admsXml/preprocessorLex.l
@@ -33,6 +33,8 @@ along with this program.  If not, see <http://www.gnu.org/licenses/>.
 #include "admsPreprocessor.h"
 #include "preprocessorYacc.h"

+char* dirname(const char* myname);
+
 #ifndef INITIAL
 #  define INITIAL 0
 #endif
@@ -133,12 +135,20 @@ static void adms_preprocessor_lex_include_file (char* fileName)
   FILE*myfh;
   p_preprocessor scanner;
   char*message=NULL;
+  char*mydir=NULL;
+  p_slist l; for(l=pproot()->Scanner;l;l=l->next)
+  {
+    adms_k2strconcat(&mydir,dirname(((p_preprocessor)l->data)->filename));
+    adms_k2strconcat(&mydir,ADMS_PATH_SEPARATOR);
+  }
+  adms_k2strconcat(&mydir,dirname(pproot()->cr_scanner->filename));
   adms_k2strconcat(&message,pproot()->cr_scanner->filename);
   adms_k2strconcat(&message,":");
   adms_strconcat(&message,adms_integertostring(adms_preprocessor_get_line_position (pproot()->cr_scanner, 0)));
   if(pproot()->cr_filename)
     free(pproot()->cr_filename);
   pproot()->cr_filename=strdup(fileName);
+  adms_slist_push(&pproot()->includePath,(p_adms)mydir);
   if(!(myfh=adms_file_open_read_with_path(fileName,(p_slist)(pproot()->includePath))))
   {
     if(!strcmp(fileName,"discipline.h")||!strcmp(fileName,"disciplines.h")||!strcmp(fileName,"discipline.vams")||!strcmp(fileName,"disciplines.vams"))
@@ -161,6 +171,7 @@ static void adms_preprocessor_lex_include_file (char* fileName)
     else
       adms_message_fatal(("[%s]: failed to open file ... '%s'\n",message,fileName))
   }
+  adms_slist_pull(&pproot()->includePath);
   scanner=(p_preprocessor)malloc(sizeof(t_preprocessor));
   adms_message_verbose(("include file '%s'\n",fileName))
   scanner->buffer=NULL;
@@ -182,6 +193,7 @@ static void adms_preprocessor_lex_include_file (char* fileName)
   adms_k2strconcat(&preprocessorlval.mystr,"\"\n");
   BEGIN( INITIAL );
   free(message);
+  free(mydir);
 }

 static char *adms_preprocessor_lex_skipp_text ()

I think the implicit extern declaration for dirname in admsPreprocessor.h masks the explicit static keyword attached to the definition in admsXml.c.

Here is an example of it working.

$ export adms_verbose="yes"
$ tree dir1/
dir1/
├── dummy.va
└── include1
    ├── include1.va
    ├── include2
    │   └── include2.va
    └── include3
        └── include3.va

3 directories, 4 files
$ cat dir1/dummy.va 
// dummy.va

`include "constants.vams"
`include "disciplines.vams"

`include "include1/include1.va"
//`include "include3/include3.va"

module dummy(a, b);
    inout a, b;
    electrical a, b;
endmodule
$ cat dir1/include1/include1.va 
`define A 1.0
`include "include2/include2.va"
$ cat dir1/include1/include2/include2.va 
`define A 1.0
$ cat dir1/include1/include3/include3.va 
`define C 3.0
$ admsXml dir1/dummy.va 
[verbose] shift: -f dir1/dummy.va
[info...] admsXml-2.3.7 (unknown) Jan 19 2022 19:07:38
[verbose] define macro ... 'insideADMS'
[verbose] create temporary file .dummy.va.adms
[verbose] include file 'constants.vams'
[verbose] include file 'disciplines.vams'
[verbose] include file 'include1/include1.va'
[verbose] include file 'include2/include2.va'
[warning] pragma redefined ... 'A'
[verbose] No error found during parsing
[verbose] -e file: .adms.implicit.xml
[verbose] traverse: .adms.implicit.xml
[verbose] .interface.xml: file created (all -e files in one file)
[info...] elapsed time: 0 (second)
[info...] admst iterations: 2225 (364 freed)

If you uncomment the include for include3/include3.va in dir1/dummy.va then you will correctly get a fatal error because it is not supposed to look in dir1/include1.

[fatal..] [dir1/dummy.va:7]: failed to open file ... 'include3/include3.va'
ngwood commented 2 years ago

I'm not 100% sure on whether this will work on Windows yet. Also, the dirname function had already been defined statically in admsXml.c meaning it's not defined in library admsElement like almost all other functions of this kind are. admsXml.c is used to create admsXml; but if you ever wanted to link against the admsPreprocessor library elsewhere you'd run into problems. Is it worth relocating the dirname and dependent functions? They could be made part of mkelements.pl. I'm thinking it would be safer and more consistent. Could we rename dirname to adms_dirname at the same time perhaps, as dirname is a standard library (string.h) function name?

felix-salfelder commented 2 years ago

On Sun, Jan 23, 2022 at 09:46:29AM -0800, Neal wrote:

Also, the dirname function had already been defined statically in admsXml.c meaning it's not defined in library admsElement like almost all other functions of this kind are. admsXml.c is used to create admsXml; but if you ever wanted to link against the admsPreprocessor library elsewhere you'd run into problems. Is it worth relocating the dirname and dependent functions? They could be made part of mkelements.pl. I'm thinking it would be safer and more consistent. Could we rename dirname to adms_dirname at the same time perhaps, as dirname is a standard library (string.h) function name?

Sounds reasonable to me.

As I understand, the static implementation of dirname should be moved to the library and exported as adms_dirname together with the others.

ngwood commented 2 years ago

I was mostly suggesting this for consistency. I did test that it works. I'm now thinking it's better if we don't, given what I know now.

On Windows, the win32_interface is a typedef of __declspec(dllexport) or __declspec(dllimport), depending on whether the function is being built into or accessed from a DLL, respectively; because the dirname function defined in admsXml.c never ends up in a DLL, there wouldn't be any issue for Windows users on this front. When I say I'm not 100% sure, it's because I don't have a Windows machine to test this with.

We only use libadmsPreprocessor for building admsXml (from admsXml.c), so we really don't need to make it available when building the other libraries, which is what putting it into libadmsElement would achieve. We could remove the static scope identifier from the dirname declaration in admsXml.c to make it obvious that this symbol is potentially used elsewhere, I guess. I suspect why I can get away with leaving it defined as static is because admsXml is built from a single file and libadmsPreprocessor is only ever used for building this one file.

ngwood commented 2 years ago

I think #109 as it stands (https://github.com/Qucs/ADMS/pull/109/files) is the best solution.

felix-salfelder commented 2 years ago

On Mon, Jan 24, 2022 at 06:27:47AM -0800, Neal wrote:

I suspect why I can get away with leaving it defined as static is because admsXml is built from a single file and libadmsPreprocessor is only ever used for building this one file.

On my system, libadmsPreprocessor is a shared library, and does not prescribe the use together with admsXml.c. I don't understand how it can use a static function from another binary. Does it not call the one from dirname(3)?

$ ldd which admsXml linux-vdso.so.1 (0x00007ffd495af000) libadmsElement.so.0 => /usr/lib/x86_64-linux-gnu/libadmsElement.so.0 (0x00007ff450f45000) libadmsVeriloga.so.0 => /usr/lib/x86_64-linux-gnu/libadmsVeriloga.so.0 (0x00007ff450f2e000) libadmsPreprocessor.so.0 => /usr/lib/x86_64-linux-gnu/libadmsPreprocessor.so.0 (0x00007ff450f1c000) libadmsAdmstpath.so.0 => /usr/lib/x86_64-linux-gnu/libadmsAdmstpath.so.0 (0x00007ff450ed2000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff450cd3000) /lib64/ld-linux-x86-64.so.2 (0x00007ff450f99000)

admsXml.c looks messy. Perhaps I need to build with your patch and look closer.

ngwood commented 2 years ago

Ahh! You are correct that libadmsPreprocessor, in my patch, it not using the dirname from admsXml.c at all; it must somehow be finding dirname(3). We could declare the ADMS dirname in a common location and rename it adms_dirname as I previously suggested; putting it in libadmsElement sounds sensible as it is already used when building both libadmsPreprocessor and admsXml. There are quite a few static function in admsXml.c. We would only really need to move those related to file path manipulation. I will update my pull request.

felix-salfelder commented 2 years ago

On Mon, Jan 24, 2022 at 08:55:46AM -0800, Neal wrote:

Ahh! You are correct that libadmsPreprocessor, in my patch, it not using the dirname from admsXml.c at all; it must somehow be finding dirname(3). We could declare the ADMS dirname in a common location and rename it adms_dirname as I previously suggested; putting it in libElements sounds sensible as it is already used when building both libadmsPreprocessor and admsXml. There are quite a few static function in admsXml.c. We would only really need to move those related to file path manipulation. I will update my pull request.

We are getting there.

If adms_dirname implements the same as dirname from POSIX.1-2001, I don't think it is sensible to add it to any of the adms libraries.

The static functions in admsXml.c are misplaced at best, and should be (re)moved. If they were added in an attempt to use a non-POSIX platform, they need to be enabled only as needed (e.g. on a specific platform), and wrap to corresponding library calls if applicable.

Maybe just #include (to get the dirname declaration) and call it a day.

ngwood commented 2 years ago

I think the static functions in admsXml.c are there for no other reason than that is the only file they are used in. They were almost all ADMS-specific functions. The problem was I needed to use the ADMS dirname function (which we already know works well and across multiple platforms) but it was out of scope. All I've done is enlarge the scope. The ADMS version of dirname is doing a similar thing to the POSIX function, but it a very ADMS way; I'm not sure what the consequences are of replacing it are. I wouldn't want to make any more changes myself at this stage as I'm reasonably sure I haven't broken anything at this point.

One thing to bare in mind is that paths in Verilog-A source code will use the forward slash as the path separator. Cygwin/MinGW versions of libgen appear to simultaneously handle forward and backward slashes, but not as robustly as one might want them to. I have no idea how a native Windows build might behave or whether libgen is available for a given toolset. Selecting how and when to use adms_dirname also requires extra work.

There is one notable difference. The adms_dirname will always behave is the same way, which is ever so slightly different than the POSIX dirname function; i.e., it normalises the file path to only use ADMS_PATH_SEPARATOR.

felix-salfelder commented 2 years ago

On Mon, Jan 24, 2022 at 01:33:32PM -0800, Neal wrote:

I wouldn't want to make any more changes myself at this stage [..]

I'd be happier if we didn't add more functions to the interface, especially not when they are questionable or bogus or unneeded.

How about pasting an inline/static definition of (adms_?)dirname where you need it, and leave the tidy-up to future devs?

Cleaning up the code will have fewer side effects than reverting the interface changes...

ngwood commented 2 years ago

I see your point. I originally thought the interface change would be less bad than repeating code. If the intention is to refactor later on then I agree that it'd be better to add a static function that can be sorted out later; this way all of the changes needed to get this working are localised to just one file. I have updated my pull request.