markwatkinson / luminous

Accurate and powerful syntax highlighting library
http://luminous.asgaard.co.uk
GNU Lesser General Public License v2.1
50 stars 5 forks source link

[idea] Modify the Luminous codebase to follow PSR standards, be more Composer-compatible, and use PHP 5.3 #31

Closed J5lx closed 9 years ago

J5lx commented 10 years ago

While working with the Luminous codebase I noticed that It follows quite unusual code conventions, while the most PHP projects follow the PSR standards nowadays. In fact, this is what I got when I first loaded the LaTeX formatter into my highly PSR-optimized editor: Lots of errors in my editor There’s a PSR-related error on almost every line of code, or in other words: The Luminous code conventions are really different from what most other projects use.

Now I’d want to know what you’re thinking about modifying the Luminous code to follow those established standards. If you have no objections, I’d be happy to do the actual code change myself. I would also make this package more Composer-compatible and -dependent, since everybody uses this tool now and it makes things a lot easier (no more require statements for every single class to use!). And finally, I’d suggest to drop support for PHP 5.2 and move on to 5.3 since all popular hosting packages support it nowadays (at least I don’t know any package that doesn’t), so we can finally use things like closures :wink:

markwatkinson commented 10 years ago

PSR and losing 5.2 compatibility came up a year or two ago and I rejected it then because 5.2 accounted for a significant number of PHP installations. I think it is more popular than you suspect[1][2], but Luminous 0.7 and earlier all support 5.2 fine and as far as I'm concerned they're stable and mostly feature complete, so people stuck with 5.2 can use 0.7. So I am happy now to support 5.3 as minimum.

As for PSR compatibility itself - I haven't kept up with this at all (I barely use PHP now), but making Luminous more compatible with other projects is a good idea overall. Is there a linter or something I can run myself and get a summary of what the incompatibilities are?

[1] http://w3techs.com/technologies/details/pl-php/5/all [2] http://wordpress.org/about/stats/

J5lx commented 10 years ago

making Luminous more compatible with other projects is a good idea overall.

Okay, so I will start "porting" it to PSR soon.

Is there a linter or something I can run myself and get a summary of what the incompatibilities are?

I guess the tool you are searching for is PHP_CodeSniffer, it’s also the tool my editor uses. You can run it via phpcs --standard=PSR2 languages src. It can take a moment to sniff all files, so if you want to see how it progresses just add -p.

markwatkinson commented 10 years ago

Sounds good!

Thanks for the link to CodeSniffer, I'll check it out tonight or tomorrow.

On Tue, Aug 19, 2014 at 10:15 AM, J5lx notifications@github.com wrote:

making Luminous more compatible with other projects is a good idea overall.

Okay, so I will start "porting" it to PSR soon.

Is there a linter or something I can run myself and get a summary of what the incompatibilities are?

I guess the tool you are searching for is PHP_CodeSniffer http://git.io/vyWXQQ, it’s also the tool my editor uses. You can run it via phpcs --standard=PSR2 languages src. It can take a moment to sniff all files, so if you want to see how it progresses just add -p.

— Reply to this email directly or view it on GitHub https://github.com/markwatkinson/luminous/issues/31#issuecomment-52608743 .

J5lx commented 10 years ago

While unifying the code style I noticed that you used different conventions for commands in documentation comments: Sometimes commands are prefixed with \ and sometimes with @. What should be the standard? I would choose the second style, since it seems to be much more popular and is the standard in both JavaDoc and phpDocumentor, two well-known documentation generators for their respective languages. In fact, I never saw the first style before looking into the code of Luminous.

Additionally I will transform the

///
///
///

documentation comments into

/**
 *
 */

ones if you have no objections because the second style seems to be much more popular (Again, I didn’t even know the first style before looking into the code of Luminous); and as you can see GitHub doesn’t even highlight the first style as a doc comment.

markwatkinson commented 10 years ago

The docs are written for Doxygen (which a few years ago seemed to be better regarded than PhpDoc, I don't know if this is still the case, but they seem fairly interchangeable).

I agree with standardising on @ and multi-line doc comments as you suggest.

As you are finding, the current code is a bit of a mess in terms of conventions :) This is because the core scanning functionality underwent a huge change in architecture and therefore a big rewrite, whereas some of the other things (like the formatters) were adapted but not rewritten. So getting it all consistent with itself and overall community conventions is really great.

J5lx commented 10 years ago

I knew the doc was written for Doxygen, but since Doxygen supports both \ and @ and both /** and /// I asked. Regarding the popularity I have to say that Doxygen is again the first PHP project I saw which uses it. Other PHP projects I saw used phpDocumentor, Read the Docs or Sami for their documentation (though the second one doesn’t seem to be an API doc generator). However Doxygen seems to be a bit more popular among compiled languages to me.

J5lx commented 10 years ago

I just stumbled upon another problem. The classes Scanner and LuminousScanner are both defined in the same file (src/core/scanner.class.php) while the autoloading standard PSR-0 requires each class to be in a file on it’s own to work. In addition to this PSR-1 says that "Code written for PHP 5.3 and after MUST use formal namespaces [instead of pseudo vendor prefixes].". But when I remove the pseudo vendor prefix from LuminousScanner I get Scanner which already exists. So I have to rename one of the classes. Looking at the phrase "It is loosely based on Ruby's StringScanner." in the description of the Scanner class I would suggest to rename it to Luminous\Core\StringScanner and rename LuminousScanner to Luminous\Core\Scanner.

markwatkinson commented 10 years ago

Simple answer: yep, go for it.

Longer answer: The structure here is probably a bit dubious. IIRC Scanner is completely unrelated from the whole notion of syntax highlighting, it's just to help consume a string. So maybe Scanner and LuminousScanner don't really belong in the same namespace. If you've got any thoughts on renaming/restructuring the namespaces feel free to suggest them.

On Wed, Aug 20, 2014 at 6:19 PM, J5lx notifications@github.com wrote:

I just stumbled upon another problem. The classes Scanner and LuminousScanner are both defined in the same file (src/core/scanner.class.php) while the autoloading standard PSR-0 http://www.php-fig.org/psr/psr-0/ requires each class to be in a file on it’s own to work. In addition to this PSR-1 http://www.php-fig.org/psr/psr-1/ says that "Code written for PHP 5.3 and after MUST use formal namespaces [instead of pseudo vendor prefixes].". But when I remove the pseudo vendor prefix from LuminousScanner I get Scanner which already exists. So I have to rename one of the classes. Looking at the phrase "It is loosely based on Ruby's StringScanner." in the description of the Scanner class I would suggest to rename it to Luminous\Core\StringScanner and rename LuminousScanner to Luminous\Core\Scanner.

— Reply to this email directly or view it on GitHub https://github.com/markwatkinson/luminous/issues/31#issuecomment-52810694 .

J5lx commented 10 years ago

Just another question: In LuminousOptions::__set() self::check_type() is only called for a few options. Shouldn’t it be called for all?

markwatkinson commented 10 years ago

I think it is called for all options, it's just a bit indirect. e.g. for the boolean options it gets called in set_bool ( https://github.com/markwatkinson/luminous/blob/b9eff4aac6f93026ef64296a59ab05e76fbeeb7a/src/options.class.php#L284 ).

I agree that makes it unnecessarily hard to read though.

On Thu, Aug 21, 2014 at 12:20 PM, J5lx notifications@github.com wrote:

Just another question: In LuminousOptions::__set() self::check_type() is only called for a few options. Shouldn’t it be called for all?

— Reply to this email directly or view it on GitHub https://github.com/markwatkinson/luminous/issues/31#issuecomment-52907170 .

J5lx commented 10 years ago

Okay, so I will change the code to call it in a unified way.

markwatkinson commented 10 years ago

Ok, thanks :) Watch out for set_format, which accepts either a string or a LuminousFormatter object

On Thu, Aug 21, 2014 at 12:33 PM, J5lx notifications@github.com wrote:

Okay, so I will change the code to call it in a unified way.

— Reply to this email directly or view it on GitHub https://github.com/markwatkinson/luminous/issues/31#issuecomment-52908205 .

J5lx commented 10 years ago

I just discovered another strange thing. In src/debug.php LUMINOUS_DEBUG is hardcoded to be on. And since src/debug.php is always included by src/luminous.php which is always included by luminous.php which is included by the user it’s always set to true. In addition to that LUMINOUS_DEBUG == true just triggers some more exceptions which seem to be highly specific to a certain development situation to me, so I wonder if it’s really necessary or could be removed, also because it seems to be used quite rarely.

markwatkinson commented 10 years ago

The packaging script turns it off - if you download a zip from the website, it should have LUMINOUS_DEBUG == false. But actually I don't like this approach much because downloading a prepackaged zip is not really fashionable for php package distribution anymore (thankfully :)) so the assumptions between dev and production environments don't really work.

IIRC the flag is used to raise internal errors, which:

  1. in a dev/debug environment we definitely want to know about (and fix), but
  2. in a production environment might be non-fatal or even inconsequential, so keep it quiet and carry on

If I was doing it now, I'd probably favour some kind of light internal error logging and then make sure tests/dev environments always check that the log is empty.

So: I say leave it as it is for now, but I'll create a separate issue to remove it.

On Thu, Aug 21, 2014 at 3:38 PM, J5lx notifications@github.com wrote:

I just discovered another strange thing. In src/debug.php LUMINOUS_DEBUG is hardcoded to be on. And since src/debug.php is always included by src/luminous.php which is always included by luminous.php which is included by the user it’s always set to true. In addition to that LUMINOUS_DEBUG == true just triggers some more exceptions which seem to be highly specific to a certain development situation to me, so I wonder if it’s really necessary or could be removed, also because it seems to be used quite seldomly.

— Reply to this email directly or view it on GitHub https://github.com/markwatkinson/luminous/issues/31#issuecomment-52928880 .

J5lx commented 10 years ago

Okay. I’ll then just see how I can get that nasty require-statement out of the code. AFAIR composer has a feature to automatically load certain files independently from the class autoloader.

J5lx commented 10 years ago

BTW, I’m still working on this – At the moment I just don’t have as much free time as I had when I begun, so I make less progress now.

markwatkinson commented 10 years ago

Ok, thanks for letting me know :)

How much work is there left?

If you like, I'm happy to merge the work in progress into a new branch so it doesn't get lost. Up to you.

On Wed, Sep 17, 2014 at 1:40 PM, J5lx notifications@github.com wrote:

BTW, I’m still working on this – At the moment I just don’t have as much free time as I had when I begun, so I make less progress now.

— Reply to this email directly or view it on GitHub https://github.com/markwatkinson/luminous/issues/31#issuecomment-55887235 .

J5lx commented 10 years ago

Oh, I didn't notice your comment yet :o

I can't exactly say how much work is left due to the different coding conventions you used ("As you are finding, the current code is a bit of a mess in terms of conventions"). Some files require more work, some less. For example, src/core/scanners.class.php contains numerous classes to adjust while other files contain only one class (luckily :wink:).

Regarding the merge into a new branch I'm not sure yet.