davidgiven / ack

The Amsterdam Compiler Kit
http://tack.sf.net

PIM's FileSystem library is missing #53

Open trijezdci opened 7 years ago

trijezdci commented 7 years ago

PIM defines a low-level I/O and file system access library called FileSystem. See PIM 3rd edition page 178 and PIM 4th edition page 161. I don't have a copy of the 2nd edition at hand but I am dead certain that it is also part of the 2nd edition.

FileSystem is the smallest common denominator for PIM Modula-2. Using FileSystem as a base, it is possible to write portable code across different PIM implementations. InOut alone is insufficient for this purpose because it lacks certain functionality (namely checking whether a file exists, and deleting and renaming a file, all of which are essential).
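As a rough illustration of the existence check mentioned above, here is a minimal sketch. It assumes the PIM-style procedures `Lookup(VAR f: File; filename: ARRAY OF CHAR; new: BOOLEAN)` and `Close(VAR f: File)`, and a `res` field that is set to `done` on success. The names `ProbeDemo`, `FileExists` and `input.txt` are made up for the example, and a given compiler's FileSystem may differ in detail.

```modula-2
MODULE ProbeDemo;  (* hypothetical demo program, not part of any library *)

IMPORT FileSystem;
FROM InOut IMPORT WriteString, WriteLn;

(* Returns TRUE if a file with the given name can be looked up. *)
PROCEDURE FileExists ( name : ARRAY OF CHAR ) : BOOLEAN;
VAR
  f : FileSystem.File;
  found : BOOLEAN;
BEGIN
  (* new = FALSE: do not create the file if it is missing *)
  FileSystem.Lookup(f, name, FALSE);
  (* assumption: res = done signals a successful lookup *)
  found := (f.res = FileSystem.done);
  IF found THEN
    FileSystem.Close(f)
  END;
  RETURN found
END FileExists;

BEGIN
  IF FileExists("input.txt") THEN
    WriteString("found")
  ELSE
    WriteString("missing")
  END;
  WriteLn
END ProbeDemo.
```

InOut offers no equivalent of the `new = FALSE` lookup, which is why a portable existence test needs FileSystem or something like it.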

I maintain a portable preprocessor for Modula-2 that is designed to compile on any Modula-2 compiler PIM or ISO.

https://github.com/m2sf/m2pp

For this I had to develop techniques and libraries to overcome the differences between the two dialects. In fact the primary purpose of the preprocessor is to allow the writing of portable code and let the preprocessor generate PIM and ISO versions if necessary.

Ironically, when FileSystem is available one can write portably across PIM implementations, but one cannot write portably across ISO Modula-2 implementations, precisely because the ISO standard doesn't provide FileSystem, nor any equivalent, nor any replacement for the missing functionality.

Thus, M2PP uses FileSystem as a base and provides per-compiler adapter libraries for ISO Modula-2 compilers to cover the missing functionality that the ISO standard doesn't provide.

There are eight ISO Modula-2 compilers, so we need eight separate implementations of the FileSystemAdapter library.

I believe a PIM compiler shouldn't require such an adapter. A PIM compiler should provide FileSystem in its library. So should ACK.

kernigh commented 7 years ago

I haven't seen a PIM book. I am trying to learn about FileSystem by looking at other Modula-2 implementations.

I know of one other PIM3 implementation, MacMETH. Its MacMETH 3.2 User Manual (PDF) describes FileSystem on page 61:

DEFINITION MODULE FileSystem;  (* W. Heiz, 4-Feb-86; A. Fischlin, 1992 *)

  FROM SYSTEM IMPORT WORD;

  TYPE
    Response  =  (done, notdone);

    File      =  RECORD
                 refNum,
                 volRef      : INTEGER;
                 firstPos,
                 lastPos,
                 curPos      : LONGINT;
                 res         : Response;
                 eof,
                 dirty       : BOOLEAN;
                 nameString  : ARRAY [0..63] OF CHAR;
                 buffer      : ARRAY [0..1024-1] OF CHAR;
                 END;

  PROCEDURE Lookup      (VAR f: File; filename: ARRAY OF CHAR; new: BOOLEAN);
  PROCEDURE Close       (VAR f: File);
  PROCEDURE Delete      (VAR f: File);
  PROCEDURE Rename      (VAR f: File; filename: ARRAY OF CHAR);
  PROCEDURE SetPos      (VAR f: File; highpos, lowpos: CARDINAL);
  PROCEDURE GetPos      (VAR f: File; VAR highpos, lowpos: CARDINAL);
  PROCEDURE Length      (VAR f: File; VAR highpos, lowpos: CARDINAL);
  PROCEDURE ReadWord    (VAR f: File; VAR w: WORD);
  PROCEDURE WriteWord   (VAR f: File; w: WORD);
  PROCEDURE ReadChar    (VAR f: File; VAR ch: CHAR);
  PROCEDURE WriteChar   (VAR f: File; ch: CHAR);

  PROCEDURE LastIOErr   ():INTEGER;

  TYPE FType  =   ARRAY [0..3] OF CHAR;

  PROCEDURE SetDefltFileType        (type, creator: FType);
  PROCEDURE SetCompFileTypes        (compCreator, sbmType, obmType, rfmType : FType);
  PROCEDURE ResetDefltFileTypes;
  PROCEDURE SetBrowsingLookupMode   (activate : BOOLEAN);
  PROCEDURE GetBrowsingLookupMode   (VAR activate : BOOLEAN);

END FileSystem.

Some parts are from PIM3, but some parts are specific to classic Mac OS. LastIOErr returns a Mac error code. FType is a 4-character type or creator code for Mac files. "Browsing Lookup Mode" means to open each file read-only, without the write lock. Most fields of File are specific to Mac. The nameString is the full path, but its limit of 63 characters is an artifact of the 1980s, before Apple made HFS.

I also found some descriptions of FileSystem that predate PIM3 (1985), so they might be PIM2. CFB Software's page about Lilith and Modula-2 links to a Modula-2 Handbook (PDF) from 1983. It's for M2M-PC, which is an IBM PC port of the compiler from Lilith. The handbook describes FileSystem on page 24:

DEFINITION MODULE FileSystem;    (* Medos-2 V3  S. E. Knudsen  1.6.81 *)

  FROM SYSTEM IMPORT ADDRESS, WORD;

  EXPORT QUALIFIED
    File, Response,

    Create, Close,
    Lookup, Rename,
    SetRead, SetWrite, SetModify, SetOpen,
    Doio,
    SetPos, GetPos, Length,

    Reset, Again,
    ReadWord, WriteWord,
    ReadChar, WriteChar.

  TYPE
    Response      = (done, notdone, notsupported, callerror,
                     unknownmedium, unknownfile, paramerror,
                     toomanyfiles, oem, deviceoff,
                     softparityerror, softprotected,
                     softerror, hardparityerror,
                     hardprotected, timeout, harderror);

    File          = RECORD
                       id: CARDINAL;
                      eof: BOOLEAN;
                      res: Response;
                    END;

  PROCEDURE Create(VAR f: File;  mediumname: ARRAY OF CHAR);
  PROCEDURE Close(VAR f: File);

  PROCEDURE Lookup(VAR f: File; filename: ARRAY OF CHAR; new: BOOLEAN);
  PROCEDURE Rename(VAR f: File; filename: ARRAY OF CHAR);

  PROCEDURE ReadWord(VAR f: File; VAR w: WORD);
  PROCEDURE WriteWord(VAR f: File; w: WORD);
  PROCEDURE ReadChar(VAR f: File; VAR ch: CHAR);
  PROCEDURE WriteChar(VAR f: File; ch: CHAR);

  PROCEDURE Reset(VAR f: File);
  PROCEDURE Again(VAR f: File);
  PROCEDURE SetPos(VAR f: File; highpos, lowpos: CARDINAL);
  PROCEDURE GetPos(VAR f: File; VAR highpos, lowpos: CARDINAL);
  PROCEDURE Length(VAR f: File; VAR highpos, lowpos: CARDINAL);

  PROCEDURE SetRead(VAR f: File);
  PROCEDURE SetWrite(VAR f: File);
  PROCEDURE SetModify(VAR f: File);
  PROCEDURE SetOpen(VAR f: File);
  PROCEDURE Doio(VAR f: File);

END FileSystem.

Some of the extra procedures do nothing. The one feature in M2M-PC but not in MacMETH seems to be Create, for opening an anonymous temporary file.

GNU Modula-2 has a "PIM [234] FileSystem compatible module" in its FileSystem.def, but has some differences:

PROCEDURE Create (VAR f: File) ;
PROCEDURE Delete (name: ARRAY OF CHAR; VAR f: File) ;

GNU's Create is not compatible with M2M-PC's Create, and GNU's Delete is not compatible with MacMETH's Delete. I have never used any Modula-2 implementations except ACK and MacMETH. If I find time, I might turn on my old Mac and play with MacMETH's FileSystem.

trijezdci commented 7 years ago

First off, I have both PIM3 and PIM4 and I'd be happy to send you the definitions of FileSystem.

As for MacMETH, this is a very problematic compiler as far as cross platform development is concerned. It has a number of deviations from any version of PIM that make it very very hard to support in any cross-platform project. For example, it removed the ability to cast altogether and then moved built-in function VAL() into module SYSTEM. While casting syntax is not portable between PIM and ISO, the VAL() function is, but MacMETH breaks this. There are plenty of other challenges. Thus MacMETH is not a good example to follow.

As for incompatible procedure signatures for Create and Delete, this is not a serious problem as long as the functionality is available. As long as there is a module FileSystem, one can provide a shim library to adapt the signatures, but if FileSystem is missing altogether, then how does one fill in the missing functionality? At that point maintaining portability to include compilers such as ACK will become a more serious effort, which means ACK may simply not be supported or it may eventually be dropped off the list of supported compilers because the project maintainers don't want to have to invest the extra effort anymore.
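A shim of the kind described could look roughly like the sketch below. It wraps the GNU-style `Delete (name: ARRAY OF CHAR; VAR f: File)` quoted earlier in the thread behind a uniform procedure. The module name `FileSystemShim` and the procedure `DeleteFile` are hypothetical, and the code assumes that the underlying `File` record has a `res` field that is set to `done` on success, as in the PIM definitions.

```modula-2
DEFINITION MODULE FileSystemShim;  (* hypothetical adapter module *)

PROCEDURE DeleteFile ( name : ARRAY OF CHAR; VAR ok : BOOLEAN );
(* deletes the named file; ok reports whether the call succeeded *)

END FileSystemShim.


IMPLEMENTATION MODULE FileSystemShim;

IMPORT FileSystem;

PROCEDURE DeleteFile ( name : ARRAY OF CHAR; VAR ok : BOOLEAN );
VAR
  f : FileSystem.File;
BEGIN
  (* GNU-style signature: the name comes first, the file variable second *)
  FileSystem.Delete(name, f);
  (* assumption: res = done signals success, as in PIM *)
  ok := (f.res = FileSystem.done)
END DeleteFile;

END FileSystemShim.
```

For a compiler with a different `Delete` signature, only the implementation module would change; client code keeps calling `DeleteFile`.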

Now, I have made specific adaptation libraries for ACK to make sure that both M2PP (our preprocessor) and M2BSK (our Modula-2 Revision 2010 bootstrap kernel) can be built using ACK, thus having FileSystem or not is not a make or break for our project.

However, there is a lot of code out there written for MS-DOS compilers, such as Logitech Modula-2, TopSpeed Modula-2 and Fitted Systems Modula-2, all of which were quite popular in the 1980s. ACK wouldn't be able to compile that for lack of FileSystem and few people would want to invest the effort to port it to ACK.

I would therefore recommend that you implement at least a subset of FileSystem. You might want to follow GNU Modula-2's implementation. As I understand it, this was taken from the Logitech Modula-2 compiler.

Let me know if you want the definitions from the PIM3 and PIM4 books and where to send them to.

trijezdci commented 7 years ago

I figured I can post the definitions here:

(* Source: PIM 3rd Edition, page 178 *)
DEFINITION MODULE FileSystem; (* S.E.Knudsen *)

  FROM SYSTEM IMPORT ADDRESS, WORD;

  TYPE
    Response =
      ( done, notdone, notsupported, callerror, unknownmedium, unknownfile, paramerror,
        toomanyfiles, eom, deviceoff, softparityerror, softprotected, softerror,
        hardparityerror, hardprotected, timeout, harderror );

    Command =
      ( create, open, close, lookup, rename, setread, setwrite, setmodify, setopen,
        doio, setpos, getpos, length, setprotect, getprotect, setpermanent,
        getpermanent, getinterval );

    Flag = ( er, ef, rd, wr, ag, bytemode );

    FlagSet = SET OF Flag;

    File = RECORD
      res : Response;
      bufa, ela, ina, topa : ADDRESS;
      elodd, inodd, eof : BOOLEAN;
      flags : FlagSet;
      CASE com : Command OF
        create, open, getinterval :
          fileno, versionno : CARDINAL
      | lookup :
          new : BOOLEAN;
      | setpos, getpos, length :
          highpos, lowpos : CARDINAL
      | setprotect, getprotect :
          wrprotect : BOOLEAN
      | setpermanent, getpermanent :
          on : BOOLEAN
      END; (* CASE *)
    END; (* RECORD *)

(* The routines defined by the file system can be grouped in routines for
    1. Opening, closing, and renaming of files.
       (Create, Close, Lookup, Rename)
    2. Reading and writing of files.
        (SetRead, SetWrite, SetModify, SetOpen, Doio)
    3. Positioning of files.
        (SetPos, GetPos, Length)
    4. Streamlike handling of files.
        (Reset, Again, ReadWord, WriteWord, ReadChar, WriteChar) *)

PROCEDURE Create ( VAR f : File; mediumname : ARRAY OF CHAR );
(* creates a new temporary (nameless) file f on the named device *)

PROCEDURE Close ( VAR f : File );
(* terminates the operations on file f, i.e. cuts off the connection between
    variable f and the file system. A temporary file will hereby be destroyed
    whereas a file with a not empty name remains in the directory for later use. *)

PROCEDURE Lookup ( VAR f : File; filename : ARRAY OF CHAR; new : BOOLEAN );
(* searches file 'filename'. If the file does not exist and 'new' is TRUE,
    a new file with the given name will be created. *)

PROCEDURE Rename ( VAR f : File; filename : ARRAY OF CHAR );
(* changes the name of the file to 'filename'. If the new name is empty,
    f is changed to become a temporary file. *)

PROCEDURE SetRead ( VAR f : File );
(* initializes the file for reading. *)

PROCEDURE SetWrite ( VAR f : File );
(* initializes the file for writing. *)

PROCEDURE SetModify ( VAR f : File );
(* initializes the file for modifying. *)

PROCEDURE SetOpen ( VAR f : File );
(* terminates any input or output operations on the file. *)

PROCEDURE Doio ( VAR f : File );
(* is used in connection with SetRead, SetWrite and SetModify
    in order to read, write or modify a file sequentially. *)

PROCEDURE SetPos ( VAR f : File; highpos, lowpos : CARDINAL );
(* sets the current position of file f to byte
    highpos * 2**16 + lowpos. *)

PROCEDURE GetPos ( VAR f : File; VAR highpos, lowpos : CARDINAL );
(* returns the current byte position of file f. *)

PROCEDURE Length ( VAR f : File; VAR highpos, lowpos : CARDINAL );
(* returns the length of file f in highpos and lowpos. *)

PROCEDURE Reset ( VAR f : File );
(* sets the file into state opened and the position to the beginning of the file. *)

PROCEDURE Again ( VAR f : File );
(* prevents a subsequent call of ReadWord (or ReadChar) from reading the
    next value on the file. Instead, the value read just before the call of Again
    is returned once more. *)

PROCEDURE ReadWord ( VAR f : File; VAR w : WORD );
(* reads the next word on the file. *)

PROCEDURE WriteWord ( VAR f : File; w : WORD );
(* appends word w to the file. *)

PROCEDURE ReadChar ( VAR f : File; VAR ch : CHAR );
(* reads the next character on the file. *)

PROCEDURE WriteChar ( VAR f : File; ch : CHAR );
(* appends character ch to the file. *)

END FileSystem.
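The definition above has no Delete procedure, but its comments on Rename and Close suggest how a file can be removed: renaming it to the empty name turns it into a temporary file, and closing a temporary file destroys it. The following is a minimal sketch of that idea. The names `RemoveDemo`, `Remove` and `scratch.tmp` are made up, and whether a particular implementation honours these semantics, or sets `res` after every call, is an assumption.

```modula-2
MODULE RemoveDemo;  (* hypothetical demo program *)

IMPORT FileSystem;

VAR
  removed : BOOLEAN;

(* Removes a named file using only Lookup, Rename and Close. *)
PROCEDURE Remove ( name : ARRAY OF CHAR; VAR ok : BOOLEAN );
VAR
  f : FileSystem.File;
BEGIN
  (* new = FALSE: fail instead of creating the file *)
  FileSystem.Lookup(f, name, FALSE);
  ok := (f.res = FileSystem.done);
  IF ok THEN
    (* an empty new name changes f into a temporary file *)
    FileSystem.Rename(f, "");
    ok := (f.res = FileSystem.done);
    (* closing a temporary file destroys it *)
    FileSystem.Close(f)
  END
END Remove;

BEGIN
  Remove("scratch.tmp", removed)
END RemoveDemo.
```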

trijezdci commented 7 years ago

(* Source: PIM 4th Edition, page 161 *)
DEFINITION MODULE FileSystem; (* S.E.Knudsen *)

  FROM SYSTEM IMPORT ADDRESS, WORD;

  TYPE
    Flag = ( er, ef, rd, wr, ag, bm );

    FlagSet = SET OF Flag;

    Response =
      ( done, notdone, lockerror, permissionerror, notsupported, callerror,
        unknownmedium, unknownfile, filenameerror, toomanyfiles,
        mediumfull, deviceoff, parityerror, harderror );

    Command =
      ( create, open, opendir, close, rename, setread, setwrite, setmodify, setopen,
        doio, setpos, getpos, length, setpermission, getpermission, setpermanent,
        getpermanent );

    Lock = ( nolock, sharedlock, exclusivelock );

    Permission = ( noperm, ownerperm, groupperm, allperm );

    MediumType = ARRAY [0..1] OF CHAR;

    File = RECORD
      bufa, ela : ADDRESS;
      elodd : BOOLEAN;
      ina : ADDRESS;
      inodd : BOOLEAN;
      topa : ADDRESS;
      flags : FlagSet;
      eof : BOOLEAN;
      res : Response;
      CASE com : Command OF
        create, open :
          new : BOOLEAN;
          lock : Lock
      | opendir :
          selections : BITSET;
      | setpos, getpos, length :
          highpos, lowpos : CARDINAL
      | setpermission, getpermission :
          readpermission, modifypermission : Permission
      | setpermanent, getpermanent :
          on : BOOLEAN
      END; (* CASE *)
      mt : MediumType;
      mediumno : CARDINAL;
      fileno : CARDINAL;
      versionno : CARDINAL;
      openedfile : ADDRESS;
    END; (* RECORD *)

PROCEDURE Create ( VAR f : File; mediumname : ARRAY OF CHAR );
PROCEDURE Lookup ( VAR f : File; filename : ARRAY OF CHAR; new : BOOLEAN );
PROCEDURE Close ( VAR f : File );
PROCEDURE Rename ( VAR f : File; filename : ARRAY OF CHAR );
PROCEDURE SetRead ( VAR f : File );
PROCEDURE SetWrite ( VAR f : File );
PROCEDURE SetModify ( VAR f : File );
PROCEDURE SetOpen ( VAR f : File );
PROCEDURE Doio ( VAR f : File );
PROCEDURE SetPos ( VAR f : File; highpos, lowpos : CARDINAL );
PROCEDURE GetPos ( VAR f : File; VAR highpos, lowpos : CARDINAL );
PROCEDURE Length ( VAR f : File; VAR highpos, lowpos : CARDINAL );
PROCEDURE Reset ( VAR f : File );
PROCEDURE Again ( VAR f : File );
PROCEDURE ReadWord ( VAR f : File; VAR w : WORD );
PROCEDURE WriteWord ( VAR f : File; w : WORD );
PROCEDURE ReadChar ( VAR f : File; VAR ch : CHAR );
PROCEDURE WriteChar ( VAR f : File; ch : CHAR );

(* The following declarations are only useful when programming
    or importing drivers *)

TYPE
  FileProc = PROCEDURE ( VAR File );
  DirectoryProc = PROCEDURE ( VAR File, ARRAY OF CHAR );

PROCEDURE CreateMedium
  ( mt : MediumType; mediumno : CARDINAL;
    fp : FileProc; dp : DirectoryProc; VAR done : BOOLEAN );

PROCEDURE DeleteMedium
  ( mt : MediumType; mediumno : CARDINAL;
    VAR done : BOOLEAN );

PROCEDURE AssignName
  ( mt : MediumType; mediumno : CARDINAL;
    mediumname : ARRAY OF CHAR; VAR done : BOOLEAN );

PROCEDURE DeassignName ( mediumname : ARRAY OF CHAR );

PROCEDURE ReadMedium
  ( index : CARDINAL;
    VAR mt : MediumType;
    VAR mediumno : CARDINAL;
    VAR mediumname : ARRAY OF CHAR;
    VAR original : BOOLEAN;
    VAR done : BOOLEAN );

PROCEDURE LookupMedium
  ( VAR mt : MediumType;
    VAR mediumno : CARDINAL;
    VAR mediumname : ARRAY OF CHAR;
    VAR done : BOOLEAN );

END FileSystem.

trijezdci commented 7 years ago

Please note that PIM defines WORD as the smallest addressable unit. In other words, PIM does not define any identifier called BYTE, but semantically, WORD is the same as BYTE. The reason for this goes back to the first implementation of Modula-2 on a PDP-11 where the smallest addressable unit was a 16-bit word.

Some PIM compilers follow PIM strictly and define BYTE as an alias of WORD, while setting the size of WORD to 8 bit on architectures where an 8-bit byte is the smallest addressable unit.

Some other PIM compilers are not actually in compliance with PIM because they modify the semantics of WORD to match the word size of the underlying processor even if that is not the smallest addressable unit, and they add a type BYTE that represents the smallest addressable unit instead of WORD as PIM calls for.

Both groups of compilers may have added ReadByte and WriteByte to FileSystem.

In the former group this would simply be a set of aliases for ReadWord and WriteWord.

In the latter group this would be an additional set of procedures with different semantics.

trijezdci commented 7 years ago

Lastly, yes indeed, M2M was PIM2-only.

You can also tell from the export list in the definition module. Only PIM2 allows exports in definition modules. PIM3 and PIM4 don't accept that. Very few compilers supported only PIM2; most were updated to at least PIM3. Thus PIM2 is not generally of any concern for cross-platform work.

davidgiven commented 7 years ago

My Modula-2 knowledge is very small (despite living in Zürich --- ETH is just up the road!), so I can't contribute much here; but I'm completely up for the idea of modernising the ACK's Modula-2 standard library.

These days the ACK assumes you've got a Posix-style syscall interface; the old EM trap mechanism is no longer used. I see that this is already bound to the Modula-2 library via the Unix and StripUnix modules. So it's probably possible to write a pure Modula-2 implementation of FileSystem in terms of this. Ideally InOut and Streams would then be rewritten in terms of FileSystem, but as they'd all use the same underlying syscall interface they'd interoperate happily without that. The total amount of work should be pretty small (for someone who knows Modula-2).

The man page for em_m2 references PIM3. How compliant to the standard is the ACK's compiler?

trijezdci commented 7 years ago

The FileSystem module of PIM is not really great design and as you can see from the two definitions I posted it also differs between PIM3 and PIM4. It is not recommendable to use FileSystem for any new development (unless it is a portability library like the one I wrote for our project).

However, it is beneficial for a PIM compiler to have at least the libraries:

With those, most legacy Modula-2 code written for the most popular Modula-2 compilers of the 1980s can either be built out of the box, or it can be easily adjusted without having to rewrite code.

It is for this reason that I am advocating adding an implementation of FileSystem to ACK.

Under different circumstances I'd be happy to write it myself and contribute it, but I am really busy trying to get our Modula-2 rev 2010 bootstrap compiler hosted on PIM/ISO for bootstrapping and do so before September. I already got sidetracked having to produce a portability I/O library so the project will compile across PIM and ISO. Thus I am short on time and can't really take on such a task right now.

However, in the light of modernising ACK's library, you are welcome to incorporate our portability library into your distribution.

https://github.com/m2sf/m2pp/blob/master/src/BasicFileIO.def https://github.com/m2sf/m2pp/blob/master/src/BasicFileSys.def

The implementor of GNU Modula-2 has also shown interest in it and the library may ship with GM2 at some point. ACK is currently supported via the POSIX libs. It is listed specifically by its name in the build configuration script menu.

https://github.com/m2sf/m2pp/blob/master/cfg/config.sh#L145

I will do intensive testing on this when our compiler is ready for bootstrap as I intend to bootstrap it using all the compilers on the list (currently 14 or 15 or thereabouts).

This would then have the advantage that if people write code for ACK using the portability I/O library, they can easily move the code to any of the other compilers on the list, even to an ISO compiler since the library bridges the dialect divide as far as I/O is concerned (other gaps can be bridged using M2PP, the preprocessor).

trijezdci commented 7 years ago

As for the question how compliant is ACK with PIM3 ...

I have not done any specific testing for this but I read the various comments by the original developer (whose name escapes me right now, apologies for that) and from that I gather that he took great care to make ACK as close to the letter and spirit of PIM as possible.

This has led to better compliance in some cases, but from Wirth's point of view lesser compliance in others. The thing with Wirth is that he thinks something is obvious when often it is not. Judged strictly by language alone, PIM has many ambiguities, although Wirth would say it is obvious.

One such case is the case of WORD versus BYTE. On the one hand, PIM says that implementors are free to provide whatever they feel in pseudo-module SYSTEM. On the other hand, PIM clearly says that type WORD provided by that pseudo-module is required and it represents the smallest addressable unit of the target platform.

Now you could argue that if you are on a platform where the smallest addressable unit is 8-bits, you are at liberty to add a type BYTE to pseudo-module system that represents the smallest addressable unit, but are you also at liberty to redefine WORD to be something other than the smallest addressable unit?

Most PIM compilers added BYTE as smallest addressable unit and redefined WORD to be a multiple of BYTE matching whatever is the machine word of the target platform. This is strictly speaking not in compliance with PIM, but ACK is one of a very few compilers who stay strictly compliant in this particular case.

Another case is the question of whether a unary minus before an expression applies only to the first term of the expression (in line with mathematical convention) or whether it applies to the entire expression (as Wirth's EBNF grammar would suggest).

When I asked Wirth about this, he said it should be obvious from the EBNF grammar that the latter is intended, but PIM does not strictly require that Modula-2 be defined by an LL grammar and implemented with an RD parser. A table driven implementation with an LR or LALR or GLR parser would lose the attribute that Wirth says is obvious.

In this case ACK follows mathematical convention, not Wirth's intent, but PIM doesn't really say.
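A small test program makes the two readings easy to tell apart. This is only a sketch: it assumes the usual PIM `InOut` procedures `WriteInt` and `WriteLn`, and the module name `MinusDemo` is made up.

```modula-2
MODULE MinusDemo;  (* hypothetical test program *)

FROM InOut IMPORT WriteInt, WriteLn;

VAR
  a, b : INTEGER;

BEGIN
  a := 3;
  b := 5;
  (* reading 1: (-a) + b =  2   (sign applies to the first term only)
     reading 2: -(a + b) = -8   (sign applies to the whole expression) *)
  WriteInt(-a + b, 1);
  WriteLn
END MinusDemo.
```

Whichever of the two values a compiler prints shows which interpretation it implements.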

Yet, from reading the various comments by the original ACK developer, I got the impression that they took compliance more seriously than many others, especially commercial compiler implementors.

davidgiven commented 7 years ago

That's good to know --- several of the ACK languages are so archaic as to be almost useless (e.g. I had to disable the Occam compiler because the language has changed so much that the ACK dialect is barely recognisable as Occam). But it sounds like the Modula-2 compiler can compile real code.

So, yes, it sounds like having a modern FileSystem module would be totally worth it.

Unfortunately, my Modula-2 is basically limited to examples/hilo.mod (which I now know doesn't even have the right filename), so I can't do it --- sorry! Also, I hate to say it, but we can't use the m2pp code, as the license is incompatible (LGPL vs the ACK's BSD). So it would need a new implementation from scratch for the ACK...

trijezdci commented 7 years ago

Why would the license of M2PP matter? You say you are using sed which is also not BSD license compatible, and you don't seem to have any issue with that. M2PP is a standalone utility. The license has no impact on the files it processes. As for re-distributing the portable I/O library. There is no impact either. First, there is no need to link to it, and second even if linked, the LGPL is not viral (unlike the GPL). This library is a standalone library that can be used by end users in order to build cross-dialect, cross-platform software that builds across multiple compilers. If they use it, the licensing of their own code is not impacted.

trijezdci commented 7 years ago

As for differences between Modula-2 dialects and compilers over time ...

I don't know much about Occam but from your description it sounds like the language had undergone very significant changes over time. Modula-2 did not.

The differences between dialects and compilers are rather few and small. For example, in PIM2 function SIZE() had to be imported from pseudo-module SYSTEM while in PIM3/4 and ISO it is a built-in that doesn't require any import. In PIM3/4 and ISO anything defined in a definition module is exported while in PIM2 identifiers must be explicitly exported using an export directive. In PIM the built-in constant NIL is compatible with function pointers while in ISO it is not and ISO defines a special constant NILPROC. In PIM casting syntax uses the type identifier as a function while in ISO there is an explicit CAST() function that needs to be imported from pseudo-module SYSTEM. In ISO a set literal must be preceded by the set's type identifier, in PIM this is optional. PIM did not strictly allow lowline _ in identifiers, but many compilers permitted it anyway, and so does ISO. Also, ISO added concatenation for string constants, it added array and record constants, and alternative characters @ and ! for ^ and |. Those guys were still living in the 1960s believing that there would be platforms that don't have ^ nor |. ISO further changed the name of the smallest addressable unit to LOC. ISO also defined dedicated pragma delimiters <* and *>, while PIM uses pragmas embedded within comments (*$...*). Neither defines any pragmas though.

All those differences are minute, even though annoying when one is trying to write portable code.

Larger differences exist when it comes to coroutines. PIM specifies a crude coroutine system that resembles how coroutines were done in assembly in the 1950s, not really worthy of a high level language (you have to share data between coroutines using global variables etc). Also, some coroutine features described in PIM are so specific to the PDP-11 on which Modula-2 was first implemented that nobody ever implemented them (e.g. IOTRANSFER). ISO replaced the PIM coroutines with a completely different design.

There are also large differences when it comes to exception handling. PIM1 (not published as a book, but only circulated as a document at ETHZ) described exception handling which was considered useful only on the PDP-11 and thus the specification for exception handling was removed when the language report was to be published as second edition (PIM2). Thus PIM2/3/4 do not have exception handling. ISO added exceptions back in but using a completely different design from that of PIM1.

ISO also added generics and OOP but only as addendums to the standard and only one compiler (on the Mac) ever implemented those. They can safely be ignored altogether.

There are vast differences in the pragmas that compilers implement since neither ISO nor PIM define any pragmas. Also completely absent is conditional compilation which many compilers added but all using their own syntax so conditional compilation is not portable and cannot even be taken for granted. This is why we need M2PP and why it has to be portable across dialects and compilers.

When it comes to libraries, there are HUGE differences. Wirth was never concerned with libraries and I think he has not understood the importance of portable libraries to this day. For all practical purposes, PIM Modula-2 is essentially a language that came without a library and everybody is on their own to build one. Each compiler vendor/implementor thus designed their own libraries. If you want to write portable code for PIM, you either have to roll your own libraries from scratch or use the few "programming examples that proved useful at ETHZ"[1] listed in PIM.

[1] this is how Wirth describes the libraries listed in PIM. They are thus not meant to be part of the language specification in terms of establishing a standard. But in absence of any portable alternative they ended up being the only fallback option.

Initially, the ISO standardisation was meant to specifically address the absence of libraries. But soon they started messing with the language itself (which they should have left to later amendments of the standard) and they completely messed up the library. The ISO library is a huge pile of garbage. It is so bad that no human language (however offensive) exists to describe just how bad it is. Seriously.

In summary, any PIM code that refrains from using pragmas, conditional compilation and coroutines and provides at least an I/O compatibility layer on top of Terminal and FileSystem is fairly straightforward to adjust to compile across different PIM compilers.

trijezdci commented 7 years ago

FYI, one of the things on my to do list (albeit low priority) is a utility that would (1) remove all EXPORT directives from Modula-2 .def files, (2) remove any FROM SYSTEM IMPORT SIZE; imports and (3) replace any occurrences of SYSTEM.SIZE with SIZE.

This would pretty much allow automatic conversion of PIM2 sources to compile on PIM3 compilers, thus removing the need to support PIM2 in any compiler and library. With such a tool available, PIM2 support could just be removed from a PIM compiler without consequence.
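To illustrate steps (2) and (3), here is a sketch of what a converted program module might look like, with the removed PIM2 constructs noted in comments. The module name `SizeDemo` is made up and the tool itself does not exist yet, so this only shows the intended result. Step (1) would correspondingly drop an `EXPORT QUALIFIED ...` list like the one in the M2M-PC definition quoted earlier in this thread.

```modula-2
MODULE SizeDemo;  (* hypothetical example of a converted source file *)

(* The PIM2 original had:  FROM SYSTEM IMPORT SIZE;
   Step (2) removes that import, since SIZE is built in from PIM3 on. *)
FROM InOut IMPORT WriteCard, WriteLn;

VAR
  buf : ARRAY [0..255] OF CHAR;

BEGIN
  (* A PIM2 original using the qualified form SYSTEM.SIZE(buf)
     would be rewritten to plain SIZE(buf) by step (3). *)
  WriteCard(SIZE(buf), 1);
  WriteLn
END SizeDemo.
```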

Maintenance on PIM compilers like ACK should probably not concern itself with PIM2.

davidgiven commented 7 years ago

Re licensing: you suggested incorporating BasicFileIO and BasicFileSys from m2pp into the ACK standard library. That means that the ACK standard library becomes a derived work consisting of the existing BSD code combined with the m2pp LGPL code; the most restrictive licensing wins, so the only way the result could be distributed is under the terms of the LGPL. The linking exemptions don't apply here because the m2pp files couldn't be replaced by the end user.

It may be possible to distribute them with the ACK provided the LGPL code gets built into a different library, which the end user would have to explicitly select when linking their program, but that's a bit hairy and I'd rather not do that (right now all in the ACK distribution is BSD licensed and I'd like to keep it that way, for simplicity).

sed and other build tools aren't relevant here because they're not distributed with the ACK --- they're system tools.

trijezdci commented 7 years ago

First off, I did not suggest any linking, but distributing the sources along with the compiler as an alternative for end users who would like to write portable code.

Second, even if you did link to an LGPL library, the LGPL license does not "win" as you put it. You can even use LGPL libraries embedded in commercial closed source software binaries and you never need to publish your sources. You only need to make the sources to the LGPL libraries available if you modified them. This is the key difference between the LGPL and the GPL.

trijezdci commented 7 years ago

BTW, the portable IO library is a subset of sorts of our I/O library for Modula-2 revision 2010. Any work we do on R10 is BSD licensed, but anything for PIM and ISO is LGPL licensed. If you were to implement a front end for (at least the bootstrap kernel subset of) R10, then you could use the R10 I/O library which is BSD licensed. However, that library needs features that aren't available in PIM and ISO, for example extensible record types (adopted from Oberon) and some others.

trijezdci commented 7 years ago

R10 is a modernised version of Modula-2 extended with key features from Oberon (Wirth's own successor to Modula-2), plus some extras such as proper coroutines modeled on Lua's coroutines, type safe variadic arguments for procedures/functions and other facilities to make interfacing to and from C easier, and suffixed number literals (0FFH, 0377B, 0377C) have been replaced with C style prefixed ones (0xFF, 0b11111111 and 0uFF).

davidgiven commented 7 years ago

If you weren't intending that the file system library be included in the ACK standard library, then I'm not clear on what you are suggesting. Could you expand?

And, unfortunately, LGPL requirements are more complicated than you make out.

The LGPL allows you to use LGPL code in non-GPL products without needing to provide source provided the LGPL code is distributed in such a way that the end user can swap out the LGPL code for a customised version. Usually the way people do this is to compile the LGPL code into dynamic libraries which are loaded at run time, because this allows another dynamic library to be substituted.

The issue with the ACK is that it doesn't support dynamic linking at all. All code is statically linked, and is therefore unmodifiable. So, any binaries created by the ACK, which link against LGPL code, do not fall under the LGPL exception and therefore become undistributable unless the end user's source is distributed.

In addition, the ACK distribution itself may or may not contain such binaries, depending on whether you define .a files to be aggregates or derived works. Either way it would make distributing the ACK more complicated.

It would be possible, with care, for the ACK to distribute LGPL code as an additional library which the end user has to explicitly opt into. Unfortunately given the licensing requirements imposed by statically linking against LGPL code I'm not sure how useful this would be.

deevans commented 6 years ago

The FileSystem module in the PIM Appendix is that of Medos. I suspect the reason FileSystem is in the appendix, where Streams is not, is that RT-11's was suitably described in the book. The Report offers yet other, simpler examples.

Wirth states he expects M2 compiler writers to write what is most consistent for their system (architecture/OS), and for programmers to design their own portable interfaces. This is so as to not place undue limitations on either, and unburden the language from non-portable requirements, a lesson obtained from Pascal. PIM states InOut, RealInOut, LineDrawing, Mathlib0, and Streams are considered standard, but not rigidly. The other modules from the book and report, as well as a select offering from RT-11 and Medos in the appendix, are examples.

Therefore I would not consider FileSystem as missing, though its availability could be useful.

PS What I'd enjoy more is a PIM4 compatible switch to the compiler. Complete PIM4 compilers are so rare, and the increased flexibility of INTEGER and CARDINAL, as well as a mod (and terminated string requirement) consistent with Oberon (and Pascal with mod), are far more useful.