ceefour closed this issue 6 years ago
I'm unclear as to why you are opening bug reports for all these items; none of the things you are reporting are actual bugs.
Relex currently does not do any logging. It depends on Apache Commons only indirectly, because one of the other packages requires it; I forget which.
What would we log, and why?
Relex is an open-source project; the only things that get "accepted" or "rejected" are good code and bad code. You haven't explained what the point of doing this would be, or what benefits it offers.
Again, keep in mind that the long-term goal is to get away from Java, and have a single, unified processing engine inside of opencog. This is still a long, long way off, so relex will have to stay in 'maintenance mode' indefinitely. But we have no plans to extend or expand the Java code in relex; the plan is to eventually port the relex-algs.txt files over to opencog.
The GitHub Issues feature is used for all kinds of tickets, be they improvements, new features, or bugs. There is a "Labels" feature so you can tag a ticket as an improvement, new feature, or bug. (See https://github.com/l0rdn1kk0n/wicket-bootstrap/issues?page=1&state=closed for a typical, colorful example of GitHub Issues usage.)
Logging is used to replace System.out.println() in a convenient, non-intrusive, configurable, and performant way.
RelEx already does logging, but I guess not using the log library. For example, this code from LocalLGParser:
    if (verbosity > 0)
    {
        Long now = System.currentTimeMillis();
        Long elapsed = now - starttime;
        System.err.println("Parse setup time: " + elapsed + " milliseconds");
    }
    if (verbosity >= 5) System.err.println("Done with parse");
Should have been:
    Long now = System.currentTimeMillis();
    Long elapsed = now - starttime;
    log.debug("Parse setup time: {} milliseconds", elapsed);
    log.info("Done with parse");
Now the code is clearer and, I would argue, better. A new person can understand "ok, this is a log for debugging, and that one is information", which is clearer than verbosity level 0, 1, 2, 3, 4, 5, and so on.
The code is shorter, with no "if"s, because the "if" is handled by the logging configuration. So whether RelEx is used as a library, run inside an IDE, run as a test from the console, or run inside a server, everyone is happy: if they need finer control (i.e. to change the verbosity), they can configure it.
It doesn't affect performance, since there's no string concatenation: the log string isn't generated if logging for that level is disabled.
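The deferred message construction described above can be illustrated with nothing but the JDK. The sketch below uses java.util.logging with a Supplier as a stand-in for SLF4J's {} placeholders, purely so that it needs no extra dependency; the class name LazyLogDemo, the logger name "relex.demo", and the helper logSetupTime are hypothetical and not part of RelEx. It counts how often the message string is actually built.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical demo class, not part of RelEx. It uses java.util.logging
// (JUL) only as a dependency-free stand-in for SLF4J.
class LazyLogDemo {
    // Hypothetical logger name chosen for this sketch.
    private static final Logger LOG = Logger.getLogger("relex.demo");

    // Counts how many times the message string is actually built.
    static final AtomicInteger built = new AtomicInteger();

    // Logs the setup time at FINE (roughly "debug") with the given logger
    // threshold, and returns how many times the message was constructed.
    static int logSetupTime(Level threshold, long elapsedMillis) {
        LOG.setLevel(threshold);
        built.set(0);
        Supplier<String> message = () -> {
            built.incrementAndGet();
            return "Parse setup time: " + elapsedMillis + " milliseconds";
        };
        // No "if (verbosity > 0)" here: the logger checks its own level
        // and only calls the supplier when FINE is enabled.
        LOG.fine(message);
        return built.get();
    }

    public static void main(String[] args) {
        System.out.println(logSetupTime(Level.INFO, 42)); // prints 0: FINE disabled, string never built
        System.out.println(logSetupTime(Level.FINE, 42)); // prints 1: FINE enabled, string built once
    }
}
```

With SLF4J the same effect comes from passing the values as arguments to a "{}" template instead of concatenating them, as in the log.debug call above.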
By default the log goes to the console, and you can specify the verbosity level. But it can also go to a file, to HTML, to another server or database, or to a GUI tool (which I like, since it gives a visual indication, with colors, of which things are doing what), all without changing the Java code to switch configuration. (Again, this is all optional, so there's nothing to install if you don't use it.)
SLF4J isn't a big library, and it's just like commons-logging (with which SLF4J is strictly compatible, BTW).
Regarding the long-term goal, sure. I'd like to contribute to the project since it seems I need it (or parts of it) for my thesis. If I see something that I think can be improved without negatively impacting existing use cases, I'll suggest it. You can review my suggestions and see if they make sense. I'm all for explaining the rationale behind them, whenever you're interested.
If someday RelEx matures and a new non-Java version of RelEx is developed that integrates better with OpenCog, sure, no problem. At least in the meantime, I can help make RelEx-Java development less painful. Someone coming to Java and RelEx for the first time may think "OMG Java is hard... RelEx is hard... NLP is hard! AI is hard!" Sure, these things are not easy, but if there are easier ways to deal with them, why not? NLP is much harder if one writes everything oneself; that's why we use OpenNLP to extract sentences and the like, so we can focus on the really hard part that our project is doing.
My hope is that, say, when a person gets hired or there's a GSoC student working on RelEx, they won't be discouraged by having to download commons-logging, opennlp, and other dependencies by hand. These are unnecessary hurdles, and we already have the solution for this case: Maven. They check out the project and run mvn install site, and all dependencies are downloaded, the project is built, the tests are run, and they get a nice HTML report. Win. :)
That's just an example; concretely, the merits should be discussed on a case-by-case basis, as a lot of my assumptions are certainly wrong.
Even if RelEx moves on from Java in the long term, it will still be a great learning experience for me, much more than if I only created my own project. Yes, I am also creating my own project, but my goal is to integrate with OpenCog components, including RelEx, whenever possible. If I can contribute improvements, whether that's new code or removing code, that'd be even more awesome.
I do the work on improvements and on lowering the barrier to entry so that the next person will not have to face that barrier, and can jump as quickly as possible to the real work, which, as you said, is natural language processing.
Reopening without comment.
The logging changes are implemented, so this can probably be closed.
Maven migration is a good thing, but it should be a separate issue. Raised as https://github.com/opencog/relex/issues/267
Closing per last comment.
SLF4J is the de facto logging API in Java, with a more convenient API and better performance than JUL or the old Apache Commons Logging.
The actual logging backend is flexible: Logback is great, but SLF4J can also work with Log4j and many others.
If accepted, you can assign it to me.