SWI-Prolog / issues

Dummy repository for issue tracking
7 stars 3 forks source link

JPL toString regression bug #76

Open Wouter1 opened 6 years ago

Wouter1 commented 6 years ago

Previously, atoms like 'a\'b' printed correctly (as 'a\'b')

But jpl 7.6.4 gives 'a'b' ,so without properly escaping special characters inside the atom

Junit test showing the failure:

        Atom atom = new Atom("a'b");
        assertEquals("'a\\'b'", atom.toString());
Wouter1 commented 6 years ago

In Atom.java, I see this code

    public String toString() {
        return (JPL.isSimpleName(name) ? name : "'" + name + "'");
    }

I think the bug is here: just adding single quotes is not the proper way to escape strings.

Wouter1 commented 6 years ago

Previously we had something like this (this comes from Compound, just to illustrate why this bug appeared) where the conversion to string was done by SWI

        return ((Atom) (new Query(new Compound("sformat", new Term[] { new Variable("S"), new Atom("~q"), new Compound(".", new Term[] { new Atom(this.name), new Atom("[]") }) }))).oneSolution().get(
            "S")).name;
JanWielemaker commented 6 years ago

Seems to be part of a massive commit by Paul in 2015. I don't know why. Possibly performance? Possibly simplicity? Anyway, I guess this code should either call Prolog or implement atom serialization correctly. There is JavaScript code doing such in https://github.com/SWI-Prolog/packages-pengines/blob/master/web/js/pengines.js (near the end). I'm not a Java programmer and I figure you or Paul do that in 10% of the time it would cost me. I'll attend Paul on this thread and wait for the pull request ...

anionic commented 6 years ago

In Fred Dushin's original code, Atom.toString() simply yielded the (unquoted, unescaped) text of the Atom. The org.jpl7.Term classes implement an alternative concrete syntax for Prolog, suiting programmed use, avoiding all quoting/escaping issues. You might say they make the "teletype" syntax redundant ;-)

The Term.toString() methods are not an essential part of the JPL API, and are there largely because it's a Java convention for a class to have such.

I toyed with reimplementing them all to deliver sourcetext which read() would understand (if we're going to have toString() it might as well do something useful) but wasn't happy calling into Prolog (format or whatever) for this, and I regard these as convenience gadgets, not core ones...

Is there a clear contract for what Wouter assumes? e.g. Term.toString() yields a text which read() converts to a variant of whatever Term.put(term_t) puts in the term ref? Will there be issues with atoms which are defined as operators? Mightn't it be simpler to just implement Term.toString, passing any term to format?

JanWielemaker commented 6 years ago

I have little opinion about this. Serializing data such that Prolog read/1 reads it back as intended is remarkably hard though and better left to Prolog itself. Writing an atom doesn't depend on whether or not it is an operator, but having an atom as part of a term may require additional spaces or parenthesis. For example, you do not have to quote && to be read as an atom, but if it is the last part of a term, you need to use a space before the fullstop (.) or you can quote the atom or but it in parenthesis. E.g., these are all valid and read as the atom &&:

&& .
'&&'.
(&&).

From @anionic's comment my guess is that Atom.toString() should simply return the plain atom text. As it is defined now it suggests to do proper Prolog serialization, but it doesn't. The alternative is to do it properly. That can be achieved using the code in Pengines I pointer at (ported to Java), calling format/3 or, avoiding the full round trip to Prolog. using PL_write_term(). Doing it all in Java is probably the cleanest and the most resource friendly solution, but also requires most code.

I guess it depends on @Wouter1's use case.

Wouter1 commented 6 years ago

@anionic

In Fred Dushin's original code, Atom.toString() simply yielded the (unquoted, unescaped) text of the Atom. The org.jpl7.Term classes implement an alternative concrete syntax for Prolog, suiting programmed use, avoiding all quoting/escaping issues. You might say they make the "teletype" syntax redundant ;-)

From the code I don't get that impression. Why would they check if it needs quotes if that was the approach? It more looks like a quick attempt to get rid of the JPL call that was there. And yes, the toString was very expensive, that was probably the reason. We avoid it if possible at all. But sometimes you need it and then it needs to be correct.

Is there a clear contract for what Wouter assumes? e.g. Term.toString() yields a text which read() converts to a variant of whatever Term.put(term_t) puts in the term ref? Will there be issues with atoms which are defined as operators? Mightn't it be simpler to just implement Term.toString, passing any term to format?

IMHO there should be some function that delivers me a term that is correct, in the sense that I can feed it back into the parser to get back my original term.

@JanWielemaker

Serializing data such that Prolog read/1 reads it back as intended is remarkably hard though and better left to Prolog itself.

Exactly. Especially if you consider the possibility that infix operators can be re-defined (not something that we do here though) So a easy solution would be to plug in the original code (like the stuff that I quoted).

From @anionic's comment my guess is that Atom.toString() should simply return the plain atom text.

If a new function is provided that gives the correct output, that's ok, I can change my code to use that other function. It's quite common that toString gives something only for Debugging purposes. I think this should be part of the jpl code, not something we have to implement.

But as it is now, there is no correct working alternative.

Our use case: we call toString on all kind of terms (queries, database formulas) , for many reasons. We use prolog as a dynamic database of changing knowledge, it changes fast and we runs lots of queries over them. Does that answers the question?

Wouter1 commented 6 years ago

As a sidenote, there are multiple ways to string-ify a prolog term.

1 A quick way would be to ignore the fixity and always use functional notation.

2 A nice looking way would use all fixity and do a nice layout.

In our use case, many terms are directly shown to the user, so 2 would have my first priority. But I see use cases for 1 also, eg for internal handling of terms (eg serialization, transmission) and debugging.

Wouter1 commented 6 years ago

(sorry, wrong button. :)

anionic commented 6 years ago

How about implementing Term.toString using PL_write_term, delete Atom.toString etc. and let the subclasses inherit Term's? I agree that the current Atom.toString is worse than useless, nobody in their right mind can be depending on its (broken) behaviour. We already have Atom.name returning the unescaped text. Jan - is PL_write_term public and stable? I can't find anything in online docs.

Historical footnote from Atom.java 1.0.1 May 1999:

public java.lang.String
toString()
{
    return name_;
}

public java.lang.String
debugString()
{
    return "(Atom " + toString() + ")";
}
JanWielemaker commented 6 years ago

PL_write_term public and stable

It surely is public. It is also stable. Just, I never bothered to document most of the IO stream functions of SWI-Prolog. I should take a day or so to do so. PL_write_term provides all that write_term/3 is doing. The other IO functions allow creating a stream that points at a malloc/realloc memory area. The only disadvantage is that if you have the Java shadow term you first have to materialize it in Prolog, serialize it and discard the whole thing.

If desired, I can provide a little C snippet that serializes a Prolog term to a C string without calling Prolog (except when portray rules are involved).

Wouter1 commented 6 years ago

Sure, just implement the root and let it be used by the inherited classes if possible.

I don't know about PL_xxx functions, I always use the JPL stuff from Java and SWI on the other end. So I would just plug in the old code. But if someone knows more about the performance and overhead aspects and L_write_term AND about how JPL manages these, please go ahead.

anionic commented 6 years ago

Looking at write_term's options, it's not clear to me that there is a representation (even the write_canonical style) which will satisfy all JPL users' expectations, and to do anything constructive will just invite more issues. If we do something, it oughta be our Last Word on the subject.

I have never been worried about tuning JPL (other than some designed-in caching), and can't easily recompile jpl.c, so am tempted to implement Term.toString as e.g. (untested)

return Query.oneSolution("format(atom(A),'~k',[?])", new Term[] {this}).get("A").name();

with a suggestion in the docs that if folks want something different they can easily roll their own.

JanWielemaker commented 6 years ago

Fine with me. Use string(A) if possible. Also ~q might be a better default.

Wouter1 commented 6 years ago

I would at least try to avoid the string parsing, as it was done in the old way

Something like

static Compound queryterm=new Compound("sformat", new Term[] { new Variable("S"), new Atom("~q"), new Compound(".", new Term[] { new Atom(this.name), new Atom("[]") }) }));

toString() { ... return ((Atom) (new Query(queryterm).oneSolution().get("S")).name; }

JanWielemaker commented 6 years ago

Sure. Wasn't there a List class these days? Otherwise I'll leave it all to you and at most apply a pull request.

Wouter1 commented 6 years ago

List class? I'm not sure what you are referring to. There is a List class but it's a java "built-in" class. But I guess that's not what you mean.

It's not clear to me now who you think should pick this up.

anionic commented 6 years ago

Wouter - I'd be thrilled if you fixed this!

Jan suggests ~q instead of ~k, and string(A); I suggest you don't need the (Atom) cast or the new Query (see my example).

I suppose we oughta have a JUnit test for this?

Wouter1 commented 6 years ago

I'll check with my boss if I can do that. I work on a project that needs this in week 33 so if I can do it, it will be that week.

I can use Query.oneSolution instead of new Query().

The original term used ~q, so yes I'd use that. I don't know why you used ~k. nor what that means.

I suppose we oughta have a JUnit test for this?

Yes I'd like junit tests but are there junit tests for jpl (where?)? If there are already existing tests I can add them.

Wouter1 commented 6 years ago

BTW I have OSX 10.12.2. Does latest SWI compile smoothly with that?

JanWielemaker commented 6 years ago

AFAIK it compiles smoothly on all MacOS versions. You do need either Homebrew or Macports to install the dependencies though.

anionic commented 6 years ago

Add unit tests to org.jpl7.test.TestJUnit

-k uses write_canonical, which doesn't require the operator definitions prevailing at read time to be those in force at write time (safer but verbose and low-level).

-q uses writeq; more human-friendly, but could be caught out by e.g. @/1 which is fy 200 in swipl-win but not an op in the console swipl (wherein XPCE isn't loaded by default).

Wouter1 commented 6 years ago

ok so I understand we fix toString() using this

format(string(S), '~q', term).

Looking into it now.

Wouter1 commented 6 years ago

@JanWielemaker is the idea that I clone the main branch of https://github.com/SWI-Prolog/packages-jpl , fix and then create a pull request ? Or do I commit straight to here?

Wouter1 commented 6 years ago

I can't find the documentation on how to build JPL. I have the packages-jpl checked out from git. Now what? I tried

make Makefile.mak

but I get

make: Nothing to be done for `Makefile.mak'".

make -f Makefile.mak

gives me

Makefile.mak:12: *** missing separator. Stop.

make all Makefile.mak gives me

make: *** No rule to make target `all'. Stop.

Also in the first line of the makefile.mak I see this

!include ..\..\..\..\src\rules.mk

actually there is no such directory, am I supposed to check out more than just this package?

I don't see a configure script. The README.MaxOS only says how to install oracle SDK and some bug report about the java installer? There is also a Makefile.mak inside

JanWielemaker commented 6 years ago

JPL is built as part of the overall make process. That means you need to checkout and build the entire system. For Windows, get yourself a Linux machine and see README.mingw. These days it might also work using the Linux subsystem for Windows. Never tried.

Wouter1 commented 6 years ago

FYI my boss agreed that I do some work here. But my time right now was re-allocated to a different project. So my work on this is postponed. Current plan is that I can work on this in week 35.

JanWielemaker commented 6 years ago

@JanWielemaker is the idea that I clone the main branch of https://github.com/SWI-Prolog/packages-jpl , fix and then create a pull request ? Or do I commit straight to here?

Yes. Probably the easy way out is to develop and debug under Linux. Once things are merged, the nightly build will create a Windows binary.

I have an open project to create a Docker image for compiling the Windows version. This is complicated though as one cannot do an unattended download and install of the Oracle JDK for license reasons ...

anionic commented 6 years ago

7.6.4 builds fine under Ubuntu installed (and updated/upgraded) in Windows Subsystem for Linux (WSL) in Windows 10 Pro 64-bit, but won't cross-compile with README.mingw because wine1.6 is needed but won't install:

$ sudo apt install wine1.6
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 wine1.6 : Depends: wine1.6-i386 (= 1:1.6.2-0ubuntu14.2) but it is not installable
           Recommends: cups-bsd but it is not going to be installed
           Recommends: gnome-exe-thumbnailer but it is not going to be installed or
                       kde-runtime but it is not going to be installed
           Recommends: fonts-droid but it is not installable
           Recommends: fonts-liberation but it is not going to be installed
           Recommends: ttf-mscorefonts-installer but it is not going to be installed
           Recommends: fonts-horai-umefont but it is not going to be installed
           Recommends: fonts-unfonts-core but it is not going to be installed
           Recommends: ttf-wqy-microhei
           Recommends: winetricks but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

Is this a 32-bit/64-bit issue?

JanWielemaker commented 6 years ago

There is probably some way without wine as wine is a Windows emulator for Linux. Seems a bit of an overkill to try and load that on top of the Linux emulator for Windows ... Anyway, I guess this is a one/few line patch, so why not simply make a pull request or send me the proposed patch and I'll build it along with the normal builds.

Wouter1 commented 6 years ago

No problem I think , I'll find a linux machine.

The patch probably is a few lines but I will also create some junit tests and that will be more.

Wouter1 commented 6 years ago

Just a short report to let you know about problems with git that others might also have when trying to "quickly" fix something in SWI

I clicked on the fork button to fork the project into my workspace https://github.com/Wouter1/swipl-devel

I kept a "busy" image indefinitely. After some 15 minutes I pressed refresh button and all seemed fine. But when I now clone the project to my work machine the download at about 30% first drops to 10kB/s and then hangs up with "The remote end hung up unexpectedly". I hope this works out quick as my time to work on this is quickly eaten away my available time...

JanWielemaker commented 6 years ago

That is not normal. Guess guthub is busy ... Unfortunatly you do not need that either. To develop and create a pull request for a package you need:

Now

and create a pull request on github. Hope I didn't make a mistake. Note that the above is if you want the whole system so you can build and test it. If you just want a pull request for packages-jpl you can simply fork that, clone, branch and push.

Wouter1 commented 6 years ago

okay, the system manager checked the machine and that problem showed to be caused by some weird misconfiguration of the network interface on that machine. So no problem with git :-)

Wouter1 commented 6 years ago

ok some more steps are needed, the error messages are clear but maybe these should be mentioned on http://www.swi-prolog.org/build/unix.html {{{ There is no 'curl' -- please install it! There is no 'autoconf' -- please install it! }}}

and when I run make distclean

"Command make not found"

This linux machine doesn't even have gcc on it either.

Wouter1 commented 6 years ago

When I run ./prepare (first step after git clone of swipl-devel), I get a prompt if I want to run git submodule update. I suppose "Y" because I need packages/jpl.

But then comes a prompt for my username and password for github.com. I can do my username but not my password. Besides I don't want to clone all packages.

I abort. The packages/jpl directory is still empty

I guess I need to fork the packages-jpl.git on github as indicated above and only run the ./prepare AFTER THAT?

And also, I can run ./prepare only once? After abort, it seems to behave differently.

Wouter1 commented 6 years ago

Ok next try, as you suggested but with different answers to get around problems.

git clone https://github.com/SWI-Prolog/swipl-devel.git
cd swipl-devel
./prepare and answer defaults

"do you want me to run git submodule update:" answer: N "could not find documentation. ..." answer: Warn only

Fork packages-jpl.git on github cd packages/jpl git remote add myfork https://github.com/Wouter1/packages-jpl.git git checkout -b mybranch

It just says "switched to a new branch "mybranch"

And jpl directory remains empty.

Wouter1 commented 6 years ago

I tried also adding a "mybranch" branch in my fork of the packages-jpl and then redid all the above. makes no difference. Then I also did a 'git pull' but that gives "there is no tracking information for the current branch".

Wouter1 commented 6 years ago

I'll try now just downloading the tarball of swi prolog from http://www.swi-prolog.org/download/devel and building from that.

after that I install the gcc compiler and run the steps in "using the build.templ script" on build/unix.html.

it fails missing zlib. After installing that too, ./build finally seems to work.

I also see warning library(readline) NOT FOUND. Don't quite get that one, "libreadline7 is already the newest version (7.0.3)" I see also gmp missing but "libgmp10 is already the newest version (2:6.1.2+dfsg-2)"

No clue what's going on there.

I try ./build again, some warnings however, one is

Warning: library (jpl) ...................... FAILED. First 3 dots are green and then the rest of that line is red.

If I scroll through the VERY long list of messages, one line seeems suspicious

/use/bin/install: cannot stat '/home/insy/Desktop/swipl-7.7.19/src/../packages/jpl/examples/prolog/jpl_midi_demo.pl': no such file or directory

Wouter1 commented 6 years ago

@JanWielemaker is there a way so that I can just work a few minutes on this single line in the code to be fixed and on some junit tests, instead of spending hours on getting this whole system checked out and built? I imagine jpl could run all by itself, as all it supposedly needs is a SWIPL.dll ? Coming from maven-style projects, I expect just a few steps :

  1. check out the packages-jpl
  2. run make/build/mvn test/whatever and check that all junit tests are OK of this package
  3. add more junit tests that show the broken piece of code
  4. do the actual fix of the jpl code
  5. run make and check the new junit tests are also OK
  6. commit packages-jpl

I think it would be of great value for your project if it could be reshaped to facilitate developers that way.

JanWielemaker commented 6 years ago

You can't as the built requires building the JNI interface, etc. The Makefile is generated and that works only out of the box in combination with the rest of Prolog. Well, if you really know what you are doing you probably can. The easy way though is to first install the entire system from the git source as I explained. Just you need to say 'Y' to install the packages. Yes, you can avoid that but it only saves a few minutes CPU and requires the necessary editing of configuration files, so why would you?

Before running the `./prepare', install the packages as mentioned on the website at http://www.swi-prolog.org/build/Debian.txt

Wouter1 commented 6 years ago

@JanWielemaker But the JNI interface is just something linked against the SWIPL.dll ? So the package could build that JNI interface itself (which is exactly what it should do anyway)?

Just you need to say 'Y' to install the packages

But I tried that already, see above. I don't have all these other packages in my github space. Nor can I give my github credentials to an unknown script. What's the best alternative that can do without github credentials?

JanWielemaker commented 6 years ago

The package configuration and build infrastructure relies on the configured infrastructure of Prolog itself, so you can only build the whole lot. That takes only a few minutes. Surely you can avoid some of that if you really want, but it only makes life complicated.

That is why you must first clone the main repo from SWI-Prolog. prepare will clone all packages from the SWI-Prolog repo. after that you link the repo you want to edit to your fork using git remote add ...

Wouter1 commented 6 years ago

@JanWielemaker

prepare asks for my git password and I don't trust this script. So this route won't work.

JanWielemaker commented 6 years ago

You should clone https://github.com/SWI-Prolog/swipl-devel.git. That should not ask for any passwords.

Wouter1 commented 6 years ago

@JanWielemaker Yes of course. I'm talking about the 3rd step.

git clone https://github.com/SWI-Prolog/swipl-devel.git
cd swipl-devel
./prepare and answer defaults
Wouter1 commented 6 years ago

I ran out of time on this, I have not been able to compile SWI from sources. This is unfortunate as the fix for the actual bug is almost trivial. Maybe this can be re-assigned to someone who already knows how to compile SWI

JanWielemaker commented 6 years ago

Guess building requires some git and Linux experience ...

If you send me the proposed test I'll apply it and you can pick up the result next day from the daily builds to test it.

Wouter1 commented 6 years ago

@JanWielemaker

These are the junit tests I think minimally needed


    @Test
    public void testEscapesInAtomToString() {
        Atom atom = new Atom("a'b");
        assertEquals("'a\\'b'", atom.toString());
    }

    @Test
    public void testEscapesInCompoundToString() {
        Compound comp = new Compound("p'q");
        assertEquals("'p\\'q'", comp.toString());
    }

These are JUnit tests but I can't seem to find these. Maybe the idea is that they are put in TestJUnit.java but without the "@Test" annotation.

Maybe SWI escapes quotes as '' instead of \' , in that case you would need to modify the test a little.

Regarding the fixes of JPL, I hope I spot them all here which is a bit tricky without compiler support but I think 2 fixes is all we need... I'm basically rolling this back some code to version JPL 6.6.6 but adapted for the current code in jpl master

* JPL.java: replace quotedName

protected static String quotedName(String name) { return ((Atom) (new Query(new Compound("sformat", new Term[] { new Variable("S"), new Atom("~q"), new Compound(".", new Term[] { new Atom(name), new Atom("[]") }) }))).oneSolution().get("S")).name; }



Can you let me know if this works?
JanWielemaker commented 6 years ago

@Wouter1 Thanks. Sorry for the delay. Pushed https://github.com/SWI-Prolog/packages-jpl/commit/56d4c050214429469e099379b370c3d815a73f03 on the toString branch.

Instead of using sformat/3 (deprecated) and avoiding even more complex query generation I added a predicate to jpl.pl. This compiles and passes the (new) tests. If you agree I'll merge this into master.

Wouter1 commented 6 years ago

@JanWielemaker thanks for the work on this! Great to hear the jjunit tests works. I see you use isSimpleName after all, is that for performance? Otherwise I'd leave that to SWI as well.

If you really want to keep it, I'd suggest to drastically simplify it as it seems way too complex for what it seems to do to me - it's simpleName if it starts with a-z and then 0 or more _,A-Z, a-z, 0-9?.

How about this instead

static final String simplepattern="[a-z][_A-Za-z0-9]*";

protected static boolean isSimpleName(String s) { 
 if (s==null) throw new NullPointerException();
 return s.matches(simplepattern);
}

Also, in that case I'd suggest more junit tests to check if that also works properly.

I'm already on to the next project so I probably can't test this quickly. Where can I find the compiled codes just in case I have some time?