milesj / decoda

A lightweight lexical string parser for BBCode styled markup.
MIT License

Respect List BBCode syntax #22

Closed remontees closed 12 years ago

remontees commented 12 years ago

Hello,

Why don't you respect the official BBCode list syntax? Source: http://forums.phpbb-fr.com/faq.php?mode=bbcode#f3r0

remontees

milesj commented 12 years ago

phpBB's syntax is not official BBCode; it's just a forum that wrote its own code for it.

Decoda is written the way it is to support its lexical parser.

remontees commented 12 years ago

Invision Power Board uses this syntax too, and so do many other services!

milesj commented 12 years ago

The only thing different is [*], which I will not support.

remontees commented 12 years ago

It could be worthwhile to support a syntax known to many users!

remontees commented 12 years ago

Why? Wouldn't it be worthwhile to respect a syntax used by millions of people?

On 2012/9/20, Miles Johnson wrote:

> The only thing different is [*] which I will not support.

milesj commented 12 years ago

Because they use regex parsing while I use lexical parsing, and frankly, it's not easy to add support for that kind of tag to the lexical parser. Secondly, how powerful is [*]? From the looks of it, not very. It wouldn't handle nested lists or tags efficiently.
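
To illustrate why a tag like [*] is awkward for a token-based parser: it has no closing tag, so every item has to be closed implicitly when the next [*] or the enclosing [/list] arrives. The sketch below is only a rough illustration under stated assumptions. The token format and the function name renderStarList are hypothetical, and none of this is Decoda's actual code.

```php
<?php
// Hypothetical token format: ['open' | 'close' | 'text', value].
// This is an illustration only, not Decoda's API or implementation.
function renderStarList(array $tokens): string
{
    $stack = []; // Currently open HTML elements, innermost last
    $html = '';

    // Close the innermost element if it is an open <li>.
    $closeItem = function () use (&$stack, &$html): void {
        if (end($stack) === 'li') {
            array_pop($stack);
            $html .= '</li>';
        }
    };

    foreach ($tokens as [$type, $value]) {
        if ($type === 'text') {
            $html .= htmlspecialchars($value);
        } elseif ($type === 'open' && $value === 'list') {
            $stack[] = 'ul';
            $html .= '<ul>';
        } elseif ($type === 'open' && $value === '*') {
            $closeItem(); // [*] implicitly ends the previous item
            $stack[] = 'li';
            $html .= '<li>';
        } elseif ($type === 'close' && $value === 'list') {
            $closeItem(); // [/list] implicitly ends the last item
            if (end($stack) === 'ul') {
                array_pop($stack);
                $html .= '</ul>';
            }
        }
    }

    return $html;
}

// Tokens for: [list][*]a[list][*]b[/list][*]c[/list]
$tokens = [
    ['open', 'list'], ['open', '*'], ['text', 'a'],
    ['open', 'list'], ['open', '*'], ['text', 'b'], ['close', 'list'],
    ['open', '*'], ['text', 'c'], ['close', 'list'],
];
echo renderStarList($tokens), "\n";
```

Every other paired tag would need the same kind of special-casing around [*], which is the extra bookkeeping an explicit closing tag avoids.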

remontees commented 12 years ago

But you could offer modes, so that each user gets the syntax they want supported. If you want an idea for the code, I have some code, though it does not work very well.

milesj commented 12 years ago

Not sure what you are talking about, but feel free to show me an example.

remontees commented 12 years ago

Here is the code:

function liste($m)
{
    // $m[1]: list type, presumably captured by a group in the opening-tag pattern ('1' means ordered)
    // $m[2]: the content between [list] and [/list]
    if (strpos($m[2], '[*]') !== false) { // At least one bullet exists between [list] and [/list]
        $type = $m[1] == '1' ? 'ol' : 'ul'; // List type

        // Build an array of the list items with explode(),
        // then drop empty items with array_map() and trim() combined with array_filter()
        $items = array_filter(array_map('trim', explode('[*]', $m[2])));

        // Wrap each item in <li> tags
        foreach ($items as &$item) {
            $item = '<li>' . $item . '</li>';
        }
        unset($item); // Break the reference left over from the loop

        // Join the items and wrap them in the list element
        return '<' . $type . '>' . implode(" ", $items) . '</' . $type . '>';
    }
    return $m[0]; // No bullets in the list: replace nothing and return the matched string
}

/*
 @param $string : the string to transform
 @param $o : pattern for the opening element (meta-characters must be escaped)
 @param $c : pattern for the closing element (likewise, escape the meta-characters)
 @param $callbackFunction : callback applied to each substring found between $o and $c
*/
function recursiveReplaceElements($string, $o, $c, $callbackFunction)
{
    // Capture the tags and their positions in the string
    if (!preg_match_all('`(?:' . $o . '|' . $c . ')`', $string, $m, PREG_OFFSET_CAPTURE)) {
        return $string;
    }

    // Build a two-dimensional array grouping each top-level block
    // with its nesting levels (0 is the top level, negative values are deeper)
    $i = 0;
    $level = 0;
    $last = null;
    $arrPieces = array();
    foreach ($m[0] as $values) {
        // $values[0] => opening|closing tag
        // $values[1] => position of the tag in the string

        if (strpos($values[0], '/') === false) { // Opening tag
            if ($last == 'start') {
                $level--;
            }
            $arrPieces[$i][$level]['start'] = $values[1];
            $last = 'start';
        } else { // Closing tag
            if ($last == 'end') {
                $level++;
            }
            $arrPieces[$i][$level]['end'] = $values[1];
            $last = 'end';
        }
        if ($level == 0 && $last == 'end') {
            ksort($arrPieces[$i]); // Deepest level first
            $i++;
        }
    }
    // Not as many closing tags as opening tags...
    if ($level !== 0) {
        return $string;
    }

    $lengthCloseTag = strlen(stripslashes($c)); // Length of the closing tag
    $pattern = '`^' . $o . '(.+)' . $c . '$`s';

    // Replace the lower-level elements inside each higher level
    $elementsTopLvl = array();
    $replacementsTopLvl = array();
    foreach ($arrPieces as $key => $blocks) {
        $lastPiece = $lastPieceReplacement = null;

        foreach ($blocks as $lvl => $pos) {
            // Extract the block from the string
            $piece = substr($string, $pos['start'], $pos['end'] + $lengthCloseTag - $pos['start']);

            // Replacements (the callback was hardcoded to 'liste'; use the parameter instead)
            if ($lastPieceReplacement !== null) { // Lower-level elements were already replaced
                $lastPieceReplacement = preg_replace_callback($pattern, $callbackFunction, str_replace($lastPiece, $lastPieceReplacement, $piece));
            } else { // Deepest level
                $lastPieceReplacement = preg_replace_callback($pattern, $callbackFunction, $piece);
            }
            $lastPiece = $piece;

            if ($lvl == 0) {
                $elementsTopLvl[] = $piece;
                $replacementsTopLvl[] = $lastPieceReplacement;
            }
        }
    }
    $string = str_replace($elementsTopLvl, $replacementsTopLvl, $string);
    return $string;
}

milesj commented 12 years ago

That's using regex, which I do not use.

remontees commented 12 years ago

Why?

milesj commented 12 years ago

Because regex is the worst possible route to take when creating parsers. I use lexical parsing: http://en.wikipedia.org/wiki/Lexical_analysis
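
As a rough illustration of the lexical approach: the input is first scanned character by character into tokens (text, opening tag, closing tag), and a second pass walks the tokens with a stack to build the output. The sketch below is a minimal example under stated assumptions, not Decoda's actual implementation. The function names tokenize and render, the token format, and the small tag map are all hypothetical.

```php
<?php
// Hypothetical sketch of lexical parsing for BBCode; not Decoda's code.

// Scan the input into tokens: ['open', name], ['close', name] or ['text', value].
function tokenize(string $input): array
{
    $tokens = [];
    $text = '';
    $length = strlen($input);
    $i = 0;

    while ($i < $length) {
        // A tag candidate starts at "[" and ends at the next "]".
        if ($input[$i] === '[' && ($end = strpos($input, ']', $i)) !== false) {
            $inner = substr($input, $i + 1, $end - $i - 1);
            $isClose = $inner !== '' && $inner[0] === '/';
            $name = $isClose ? substr($inner, 1) : $inner;

            // Only purely alphabetic names count as tags in this sketch.
            if ($name !== '' && ctype_alpha($name)) {
                if ($text !== '') {
                    $tokens[] = ['text', $text];
                    $text = '';
                }
                $tokens[] = [$isClose ? 'close' : 'open', $name];
                $i = $end + 1;
                continue;
            }
        }
        $text .= $input[$i];
        $i++;
    }
    if ($text !== '') {
        $tokens[] = ['text', $text];
    }
    return $tokens;
}

// Walk the tokens with a stack so that tags must nest properly.
function render(array $tokens): string
{
    $map = ['list' => 'ul', 'li' => 'li', 'b' => 'strong']; // Assumed tag map
    $stack = [];
    $html = '';

    foreach ($tokens as [$type, $value]) {
        if ($type === 'text') {
            $html .= htmlspecialchars($value);
        } elseif ($type === 'open' && isset($map[$value])) {
            $stack[] = $value;
            $html .= '<' . $map[$value] . '>';
        } elseif ($type === 'close' && end($stack) === $value) {
            array_pop($stack);
            $html .= '</' . $map[$value] . '>';
        } else {
            // Unknown or mismatched tag: output it literally.
            $html .= htmlspecialchars(($type === 'close' ? '[/' : '[') . $value . ']');
        }
    }
    // Close anything the input left open.
    while ($stack) {
        $html .= '</' . $map[array_pop($stack)] . '>';
    }
    return $html;
}

echo render(tokenize('[list][li]a[/li][li][b]b[/b][/li][/list]')), "\n";
```

Because nesting is tracked on a stack rather than matched by a pattern, nested lists and tags fall out of the same loop without any recursion over substrings.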