Unidata / UDUNITS-2

API and utility for arithmetic manipulation of units of physical quantities
http://www.unidata.ucar.edu/software/udunits
Other
62 stars 36 forks source link

scanner & parser: Including a `.c` file causes trouble for other build systems #67

Open schwehr opened 6 years ago

schwehr commented 6 years ago

Having parser.c include scanner.c is trouble for me. I'm building udunits with bazel and having to rework the scanner and parser. It would be great to get to a more traditional setup for the lexer and parser with each as a separate compilation using and each having a header.

semmerson commented 6 years ago

I recall I did this for a reason -- but I don't recall the reason (yet another example of why comments should state the "why" of things); consequently, I'll have to think about it.

schwehr commented 6 years ago

I will see if I can get a setup working today that does this as two separate compilation units with headers.

schwehr commented 6 years ago

I got a first pass of separating scanner.c from parser.c. It's still missing things like a common header for functions and globals that cross, but it's a start.

diff --git a/lib/CMakeLists.txt b/lib/CMakeLists.txt
index 17b2304..4f6a622 100644
--- a/lib/CMakeLists.txt
+++ b/lib/CMakeLists.txt
@@ -28,12 +28,12 @@ IF(UNIX)
     add_custom_command(
         OUTPUT ${CMAKE_CURRENT_SOURCE_DIR}/scanner.c
         WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
-        COMMAND ${UD_FLEX} -d -Put -o scanner.c scanner.l
+        COMMAND ${UD_FLEX} -Put -o scanner.c scanner.l
         DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/scanner.l)
     add_custom_command(
         OUTPUT ${CMAKE_CURRENT_SOURCE_DIR}/parser.c
         WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
-        COMMAND ${UD_BISON} -t -p ut -o parser.c parser.y
+        COMMAND ${UD_BISON} -p ut --defines -o parser.c parser.y
         DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/parser.y)
     set_source_files_properties(parser.c
         PROPERTIES OBJECT_DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/scanner.c)
@@ -45,6 +45,7 @@ SET(libudunits2_src converter.c
            idToUnitMap.c
            parser.c
            prefix.c
+           scanner.c
            status.c
            systemMap.c
            unitAndId.c
diff --git a/lib/parser.y b/lib/parser.y
index e727510..e09767f 100644
--- a/lib/parser.y
+++ b/lib/parser.y
@@ -28,14 +28,20 @@
 #include <string.h>
 #include <strings.h>
 #endif
+
 #include "udunits2.h"
+#include "prefix.h"
+#include "parser.h"
+#include "scanner.h"
+
+char *utGetYyCBufP(void);

 static ut_unit*        _finalUnit; /* fully-parsed specification */
 static ut_system*  _unitSystem;    /* The unit-system to use */
 static char*       _errorMessage;  /* last error-message */
 static ut_encoding _encoding;  /* encoding of string to be parsed */
-static int     _restartScanner;/* restart scanner? */
-static int     _isTime;        /* product_exp is time? */
+int        _restartScanner;/* restart scanner? */
+int        _isTime;        /* product_exp is time? */

 /*
@@ -525,7 +531,7 @@ timestamp:  DATE {
 #define yyname     utyyname
 #define yyrule     utyyrule

-#include "scanner.c"
+/* #include "scanner.c" */

 /*
@@ -652,7 +658,8 @@ ut_parse(

             if (utparse() == 0) {
                 int     status;
-                int    n = yy_c_buf_p  - buf->yy_ch_buf;
+                /* Offset from the beginning of the buffer */
+                const int n = utGetYyCBufP() - buf->yy_ch_buf;

                 if (n >= strlen(utf8String)) {
                     unit = _finalUnit; /* success */
diff --git a/lib/prefix.h b/lib/prefix.h
index 63b9dc1..621a0df 100644
--- a/lib/prefix.h
+++ b/lib/prefix.h
@@ -26,12 +26,13 @@ extern "C" {
  * UT_EXISTS   "name" already maps to a different value.
  * UT_OS       Operating-system failure.  See "errno".
  */
+# if 0
 ut_status
 ut_add_name_prefix(
     ut_system* const   system,
     const char* const  name,
     const double   value);
-
+#endif
 /*
  * Adds a symbol-prefix to a unit-system.  A symbol-prefix is something like
  * "M" or "y".  Comparisons between symbol-prefixes are case-sensitive.
diff --git a/lib/scanner.l b/lib/scanner.l
index 85c6623..f5c3da6 100644
--- a/lib/scanner.l
+++ b/lib/scanner.l
@@ -10,6 +10,9 @@
  */

 %option noyywrap
+%option nounput
+%option header-file="scanner.h"

 %{

@@ -22,7 +25,15 @@
 #include <string.h>
 #include <time.h>

-/**
+#include "udunits2.h"
+#include "parser.h"
+   // #include "scanner.h"
+
+int        _restartScanner;/* restart scanner? */
+int        _isTime;        /* product_exp is time? */
+#define yylval          utlval
+
+   /**
  * Decodes a date.
  *
  * @param[in]  text     The text specifying the date to be decoded.
@@ -92,6 +103,10 @@ static int decodeReal(
     return ERR;
 }

+char *utGetYyCBufP(void) {
+  return yy_c_buf_p;
+}
+
 %}

 space          [ \t\r\f\v]
semmerson commented 6 years ago

Modifying parser.c and scanner.c won't work because they're created from parser.y and scanner.y. You need to modify the latter files.

The .c files are included in the codebase because not all yacc(1)s and lex(1)es were the same when the package was created.

I probably need to revisit that decision.

semmerson commented 6 years ago

@schwehr I forgot to address you in my previous comment. I hope you see it.

schwehr commented 6 years ago

@semmerson , If you look at my patch, I am editing CMakeLists.txt, parser.y, and scanner.l. I'm not following your comment about modifying parser.c and scanner.c. I'm only changing them transitively via flex and bison output. If you are not used to diff/patch files, these are "unified" diff format (the default for git diff) and you can find the modified files by searching for the string +++. e.g. I'm modifying parser.y starting at this text:

diff --git a/lib/parser.y b/lib/parser.y
index e727510..e09767f 100644
--- a/lib/parser.y
+++ b/lib/parser.y
@@ -28,14 +28,20 @@

I did also have to edit prefix.h to #if 0 out ut_add_name_prefix as I was getting two definitions as it is also defined in udunits2.h

Or was there something else to your comment that I missed?

semmerson commented 6 years ago

You're right. I'm used to patch files -- I just didn't recognize my own handiwork.

When successful, will you put this in the form of a pull request?

schwehr commented 6 years ago

I will do a multi-commit pull request so that each step is easier to follow