teran / mediawiki-googlerichcards

MediaWiki extension to generate Google Rich Cards metadata for article pages
GNU General Public License v2.0

Is it compatible with actual version of MediaWiki? #1

Closed giby closed 2 years ago

giby commented 6 years ago

Hello,

I tried your extension, but after installing it and clearing the cache, it had no effect. The script was not present in the page source.

teran commented 6 years ago

Hello @giby !

Thanks for using my extension! I'm using it with 1.31.0-rc.0 at the moment without any additional changes since 1.28, so, it's compatible.

Could you please describe the exact steps you took to install the extension and the behaviour you got? In a manner like:

Thanks in advance.

giby commented 6 years ago

Thanks for such a quick answer!

I actually had an issue with finding the metadata in the text… Shame on me, I was looking for "first revision author" that was actually replaced by the actual value!

So it is working now…

I did a little "hack" on your code in order to add custom things…

Here is what it looks like:

<?php
/**
 * GoogleRichCards
 * Google Rich Cards metadata generator
 *
 * PHP version 5.4
 *
 * @category Extension
 * @package  GoogleRichCards
 * @author   Igor Shishkin <me@teran.ru>
 * @license  GPL http://www.gnu.org/licenses/gpl.html
 * @link     https://github.com/teran/mediawiki-GoogleRichCards
 * */
$wgExtensionCredits['validextensionclass'][] = array(
   'name' => 'GoogleRichCards',
   'author' =>'Igor Shishkin',
   'url' => 'https://github.com/teran/mediawiki-GoogleRichCards'
);

if ( !defined( 'MEDIAWIKI' ) ) {
  echo( "This is a Mediawiki extension and doesn't provide standalone functionality\n" );
  die(1);
}

function GoogleRichCards(&$out) {
    global $wgLogo, $wgServer, $wgSitename, $wgTitle;
    if($wgTitle instanceof Title && $wgTitle->isContentPage()) {
      $ctime = DateTime::createFromFormat('YmdHis', $wgTitle->getEarliestRevTime());
      $mtime = DateTime::createFromFormat('YmdHis', $wgTitle->getTouched());
      if($ctime) {
        $created_timestamp = $ctime->format('c');
      } else {
        $created_timestamp = '0';
      }

      if($mtime) {
        $modified_timestamp = $mtime->format('c');
      } else {
        $modified_timestamp = '0';
      }

      $first_revision = $wgTitle->getFirstRevision();
      if($first_revision) {
        $author = $first_revision->getUserText();
      } else {
        $author = 'None';
      }

      $image = key($out->getFileSearchOptions());
      if($image && $image_object = wfFindFile($image)) {

        $image_url = $image_object->getFullURL();
        $image_width = $image_object->getWidth();
        $image_height = $image_object->getHeight();
      } else {
        $image_url = $wgServer.$wgLogo; // Mediawiki logo to be used by default
        $image_width = 135; // Default max logo width
        $image_height = 135; // Default max logo height
      }

      $out->addHeadItem(
          'GoogleRichCards',
          '<script type="application/ld+json">
          {
             "@context": "http://schema.org",
             "@type": "Article",
             "mainEntityOfPage": {
               "@type": "WebPage",
               "@id": "'.$wgTitle->getFullURL().'"
             },
             "author": {
               "@type": "Person",
               "name": "CAHV",
               "url": "http://www.histoire-valenciennes-cahv.fr/",
               "sameAs": "http://www.facebook.com/CAHValenciennes/",
               "email": "info@histoire-valenciennes-cahv.fr"
             },
             "headline": "'.$wgTitle.'",
             "dateCreated": "'.$created_timestamp.'",
             "datePublished": "'.$created_timestamp.'",
             "dateModified": "'.$modified_timestamp.'",
             "discussionUrl": "'.$wgServer.'/'.$wgTitle->getTalkPage().'",
             "image": {
               "@type": "ImageObject",
               "url": "'.$image_url.'",
               "height": '.$image_height.',
               "width": '.$image_width.'
             },
             "publisher": {
               "@type": "Organization",
               "name": "'.$wgSitename.'",
               "logo": {
                 "@type": "ImageObject",
                 "url": "'.$wgServer.$wgLogo.'"
               },
               "email": "info@histoire-valenciennes-cahv.fr",
               "legalName": "Cercle Archéologique et Historique de Valenciennes et de son arrondissement"
             },
             "description": "'.$wgTitle->getText().'"
           }
           </script>');
         }
}

$wgHooks['BeforePageDisplay'][] = 'GoogleRichCards';

?>

It should not be that painful to add the few things I did (email, sameAs, legalName…)

Just a question: Why does the LocalSettings.php use a syntax "require_once "$IP/extensions/GoogleRichCards/GoogleRichCards.php";" and not wfLoadExtension( 'GoogleRichCards' ); ?

Have you also done anything for when someone wants to add metadata to a single page? I would like to add a ""@type":"Book"" tag…

teran commented 6 years ago

@giby could you please send this as a pull request? We could discuss it right there. My main view on changes: if it works, doesn't break compatibility and conforms to the Rich Cards spec, I'm all for it :)

Thanks!

giby commented 6 years ago

Oh… I would have to clean up the code for everyone then…

teran commented 6 years ago

Here's the thing :) You are free to use your local "hack" locally at the moment. Btw, I've checked the Rich Snippets docs and couldn't find anything that allows using email and sameAs for "@type": "Person", or email and legalName for "@type": "Organization". Could you please post a reference to docs or specs that allow using these fields this way? If that's a proper way to do it, I could probably update the extension to allow adding them via LocalSettings.php, since it's static data anyway.

Have you also done anything for when someone wants to add metadata to a single page? I would like to add a ""@type":"Book"" tag…

I believe it should be done via dedicated templates or tags within an article (since you probably want to post a number of books, events, products, etc.), so it doesn't look like a part of this extension in its current form. I have no objections to extending it that way, but no promises on doing that fast.

And I missed....

Just a question: Why does the LocalSettings.php use a syntax "require_once "$IP/extensions/GoogleRichCards/GoogleRichCards.php";" and not wfLoadExtension( 'GoogleRichCards' ); ?

Because it's much faster to do. wfLoadExtension looks for a metadata file (extension.json) which should contain some build-time data, like the version for instance, which I couldn't set automatically since there's no CI/CD for this extension, and manual updates are error-prone and make no sense for that. ref: https://www.mediawiki.org/wiki/Manual:Extension.json/Schema

So I can't see any pros for wfLoadExtension, especially if you're gonna copy the line out of the README anyway. And the cons are: it makes no sense without CI, or with manual operations, which are error-prone.

giby commented 6 years ago

Giving you my source is easy…

Sameas: https://developers.google.com/search/docs/data-types/social-profile

<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Person",
  "name": "your name",
  "url": "http://www.your-site.com",
  "sameAs": [
    "http://www.facebook.com/your-profile",
    "http://instagram.com/yourProfile",
    "http://www.linkedin.com/in/yourprofile",
    "http://plus.google.com/your_profile"
  ]
}
</script>

Well email was in contact point: https://search.google.com/structured-data/testing-tool

I could not find "legalName"… I'm sure I found it on the same page… It's been a long time since I tried to implement it… Or maybe here http://schema.org/legalName

I also saw something for initials, for example "Cercle Archéologique et Historique de Valenciennes et de son arrondissement" is known as "CAHV"

If in some days it is working on google, I'll let you know! Actually, the guide I sent you provides a lot of other features… that I was not in need of…

Indeed, the description of "event" or "book" should be done manually… it isn't possible to make it "automatically" for each of them… I actually have no idea about how to proceed…

Speaking of the extension, I was thinking about adding:

 <script type="application/ld+json">
            {
                "@context":"http://schema.org",
                "@type":"ItemList",
                "itemListElement":[
                                   {
                                   "@type":"ListItem",
                                   "position":1,
                                   "url":"http://www.histoire-valenciennes-cahv.fr/Memoires/index.php/Memoires/issue/view/60"
                                   },
                                   {
                                   "@type":"ListItem",
                                   "position":2,
                                   "url":"http://www.histoire-valenciennes-cahv.fr/Memoires/index.php/Memoires/issue/view/59"
                                   },
                                   {
                                   "@type":"ListItem",
                                   "position":3,
                                   "url":"http://www.histoire-valenciennes-cahv.fr/Memoires/index.php/Memoires/issue/view/58"
                                   }
                                   ]
            }
        </script>

But I have no idea how to do it… I feel an OJS plugin would be more relevant than a MediaWiki one…

I thought that "wfLoadExtension" was the current syntax and that the other was deprecated… Well, I am basically a biologist and not a programmer… I do a lot of R coding, and give basic help to a few projects… and also build websites for an association organizing events and selling books about local history…

Wish you good night

teran commented 6 years ago

So the thing is you need to get this data from somewhere... MediaWiki doesn't support social profiles, so the only easy and reliable way I can imagine is to fetch the User: page and look for some markup there.

About "@type": "event" or "@type": "book": it depends... You normally need some data to render them; just a name is not enough. This data (author, ISBN, edition, etc.) could be provided via a MediaWiki template, which displays the book/event properly on the MediaWiki page, and the same template could be used as the data source for a rich snippet of "@type": "book" or "event". You would need to parse it properly and your task is done. More about templates: https://www.mediawiki.org/wiki/Help:Templates

I thought that "wfLoadExtension" was the current syntax and that the other was deprecated…

Yes and no at the same time :) PHP has a set of statements (include, require, require_once, include_once) for including another file at runtime. wfLoadExtension is a newer way provided by MediaWiki itself; they're trying to organise how developers provide extensions, and they're right to do that, since they can handle improper extensions, check versions, etc. The thing here is about implementation and proper usage. But currently it makes no sense to use it in this extension; it would give no benefits at all.

PS. Please use triple backticks (```) for multiline code snippets. Thanks.

giby commented 6 years ago

It sounds easy to add a line to LocalSettings.php like: $Facebook = "…" and then a line in the code: if $Facebook is not empty, then "sameAs": "…",

It does not depend on the user if the page relates to the whole wiki.
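The idea of an optional setting can be sketched roughly as follows. This is only an illustration under stated assumptions: `$wgGoogleRichCardsSameAs` and `buildAuthor()` are hypothetical names, not something the extension is known to provide, and the sketch assumes PHP 7.1 or later. The author block is built as a PHP array and encoded with `json_encode()`, so "sameAs" only appears when a value is configured.

```php
<?php
// Hypothetical setting that would live in LocalSettings.php.
// The extension is not known to read a variable with this name.
$wgGoogleRichCardsSameAs = 'http://www.facebook.com/CAHValenciennes/';

/**
 * Build the "author" part of the metadata as a PHP array.
 * "sameAs" is added only when a non-empty value is configured.
 */
function buildAuthor(string $name, string $url, ?string $sameAs): array {
    $author = [
        '@type' => 'Person',
        'name'  => $name,
        'url'   => $url,
    ];
    if ($sameAs !== null && $sameAs !== '') {
        $author['sameAs'] = $sameAs;
    }
    return $author;
}

$author = buildAuthor('CAHV', 'http://www.histoire-valenciennes-cahv.fr/', $wgGoogleRichCardsSameAs);
echo json_encode($author, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```

Leaving the setting empty (or unset and passed as null) simply omits the key, so there is no dangling comma to worry about.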

I think templates do not fit what I'm looking for. For example, if I need to put:


        <script type="application/ld+json">
            {
                "@context":"http://schema.org",
                "@type":"Book",
                 "image": "http://www.histoire-valenciennes-cahv.fr/Memoires/public/journals/1/cover_issue_58_fr_CA.jpg",
                "name" : "Cercle Archéologique et Historique de Valenciennes - Mémoires - Tome XI",
                "url" : "http://www.histoire-valenciennes-cahv.fr/Memoires/index.php/Memoires/issue/view/58",
                "workExample" : [{
                                 "@type": "Book",
                                 "potentialAction":{
                                 "@type":"ReadAction",
                                 "target":
                                 {
                                 "@type":"EntryPoint",
                                 "urlTemplate":"http://www.histoire-valenciennes-cahv.fr/Memoires/index.php/Memoires/issue/view/58",
                                 "actionPlatform":[
                                                   "http://schema.org/DesktopWebPlatform",
                                                   "http://schema.org/IOSPlatform",
                                                   "http://schema.org/AndroidPlatform"
                                                   ]
                                 },
                                 "expectsAcceptanceOf":{
                                 "@type":"Offer",
                                 "price":20,
                                 "priceCurrency":"EUR",
                                 "eligibleRegion" : {
                                 "@type":"Country",
                                 "name":"FR"
                                 },
                                 "availability": "http://schema.org/InStock"
                                 }
                                 }
                                 }]
            }
        </script>

on one page, the issue is being able to add some text to the page source that ends up rendered like this in the final HTML…

teran commented 6 years ago

It depends on what kind of object you're going to describe that way. Normally your approach would work for "Organization" only, and that's why it's not flexible enough for everything else. Unfortunately.

Here's template invocation from Wikipedia:

{{Cite book
| publisher = American Library Association; The British Library
| ISBN = 978-0-8389-0522-7
| last = Avrin
| first = Leila
| title = Scribes, script, and books: the book arts from antiquity to the Renaissance
| location = New York, New York
| year = 1991
| page = 83
}}

It already contains most of the data you need for a rich snippet, so why wouldn't it fit for describing books? The idea here is to describe books via templates, then render that data in articles via MediaWiki itself, and as rich snippets via an extension that parses the template (you can access page content from an extension with a specific hook; that's how GoogleRichCards works, btw).
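To make the template idea concrete, here is a minimal sketch in plain PHP, outside MediaWiki. It is not how the extension does it: `parseTemplateParams()` and `bookToJsonLd()` are hypothetical helpers, the parsing is deliberately naive (no nested templates, links, or pipes inside values), and the mapping to schema.org properties is an assumption about what one would want in a Book snippet.

```php
<?php
/**
 * Very simplified parser for a single template call such as {{Cite book | key = value }}.
 * It does not handle nested templates, links, or pipes inside values.
 */
function parseTemplateParams(string $wikitext, string $templateName): array {
    $pattern = '/\{\{\s*' . preg_quote($templateName, '/') . '\s*\|(.*?)\}\}/si';
    if (!preg_match($pattern, $wikitext, $m)) {
        return [];
    }
    $params = [];
    foreach (explode('|', $m[1]) as $part) {
        $pair = explode('=', $part, 2);
        if (count($pair) === 2) {
            $params[trim($pair[0])] = trim($pair[1]);
        }
    }
    return $params;
}

/** Map the parsed parameters onto a schema.org Book structure. */
function bookToJsonLd(array $p): array {
    $book = ['@context' => 'http://schema.org', '@type' => 'Book'];
    if (isset($p['title'])) {
        $book['name'] = $p['title'];
    }
    if (isset($p['ISBN'])) {
        $book['isbn'] = $p['ISBN'];
    }
    if (isset($p['first'], $p['last'])) {
        $book['author'] = ['@type' => 'Person', 'name' => $p['first'] . ' ' . $p['last']];
    }
    if (isset($p['publisher'])) {
        $book['publisher'] = ['@type' => 'Organization', 'name' => $p['publisher']];
    }
    return $book;
}

$wikitext = <<<'WIKI'
{{Cite book
| publisher = American Library Association; The British Library
| ISBN = 978-0-8389-0522-7
| last = Avrin
| first = Leila
| title = Scribes, script, and books: the book arts from antiquity to the Renaissance
| location = New York, New York
| year = 1991
| page = 83
}}
WIKI;

$book = bookToJsonLd(parseTemplateParams($wikitext, 'Cite book'));
echo json_encode($book, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```

A real implementation would more likely rely on MediaWiki's own parser than on a regular expression, but the data flow (template parameters in, JSON-LD out) would be the same.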

teran commented 6 years ago

Btw, I just updated the extension to use php-json instead of a hardcoded string, which should ease your case of adding multiple items, since you can now build the required data structure and have it converted to JSON without paying any attention to its format and escaping.
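A small standalone sketch of why encoding a data structure is safer than concatenating a string. The title below is an invented example, not taken from the wiki; the point is only that quotes and backslashes in a value break hand-built JSON, while `json_encode()` escapes them.

```php
<?php
// A title containing characters that would break a hand-written JSON string.
$title = 'The "Mémoires" of the CAHV \ Tome XI';

// Hand-built string: the inner quotes end the JSON string early.
$handBuilt = '{"headline": "' . $title . '"}';

// Built as a PHP array and encoded: quotes and backslashes are escaped automatically.
$encoded = json_encode(['headline' => $title], JSON_UNESCAPED_UNICODE);

var_dump(json_decode($handBuilt));                             // NULL, not valid JSON
var_dump(json_decode($encoded, true)['headline'] === $title);  // bool(true)
```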

giby commented 6 years ago

What I tried was done following Google's Book rich snippet documentation…

I'll install the update… Let's see

teran commented 6 years ago

Hello @giby !

I've updated the extension to cover some of your highlights:

I'm gonna polish the newly refactored code for now, but you can already take a look at how it's implemented and use the same approach for "@type": "Book"; it should be pretty easy now.

Thanks for your highlights!

PS. I'll try not to stop adding new @type's, so all the rest will be added, but once again, no promises on doing that fast, sorry.

giby commented 5 years ago

Hi,

Here is a little feedback.

I used this code:

{{Event
|name=Conférence du Cercle Archéologique et Historique de Valenciennes et de son arrondissement 
|startDate=2018-10-28T15:00:00Z
|endDate=2018-10-28T17:45:00Z
|place=Musée des Beaux-Arts de Valenciennes
|description=Les impacts environnementaux de l’extraction du charbon dans le Valenciennois et le Borinage du XVIIIe siècle à l’entre-deux guerres

|postalCode= 59300
|locality=Valenciennes
|region=Haut de France 
|streetAddress=Boulevard Watteau
|country=FR
|performer= Kevin Troch
|validFrom=2018-10-28T15:00:00Z
|offerPrice=5 
|offerCurrency=EUR
|image=www.histoire-valenciennes-cahv.fr/images/thumb/0/0f/SAICOM_CHP_532_Hensies_1934.jpg/1200px-SAICOM_CHP_532_Hensies_1934.jpg
|offerURL=http://www.histoire-valenciennes-cahv.fr/images/thumb/0/0f/SAICOM_CHP_532_Hensies_1934.jpg/1200px-SAICOM_CHP_532_Hensies_1934.jpg
}}

I have got 4 warnings:

Shows and tickets

Conférence du Cercle Archéologique et Historique de Valenciennes et de son arrondissement
4 warnings
Missing field "image" (optional)
type: Event
name: Conférence du Cercle Archéologique et Historique de Valenciennes et de son arrondissement
startDate: 2018-10-28T15:00:00Z
endDate: 2018-10-28T17:45:00Z
description: Les impacts environnementaux de l’extraction du charbon dans le Valenciennois et le Borinage du XVIIIe siècle à l’entre-deux guerres
location
  type: Place
  name: Musée des Beaux-Arts de Valenciennes
  address
    type: PostalAddress
    streetAddress: Boulevard Watteau
    addressLocality: Valenciennes
    postalCode: 59300
    addressRegion: Haut de France
    addressCountry
      type: Country
      name: FR
performer
  type: PerformingGroup
  name: Kevin Troch
offers
  Invalid URL in field "url" (optional)
  Invalid value type for field "url" (optional)
  Invalid value type for field "availability" (optional)
  type: Offer
  url: <a rel="nofollow" class="external free" href="http://www.histoire-valenciennes-cahv.fr/images/thumb/0/0f/SAICOM_CHP_532_Hensies_1934.jpg/1200px-SAICOM_CHP_532_Hensies_1934.jpg">http://www.histoire-valenciennes-cahv.fr/images/thumb/0/0f/SAICOM_CHP_532_Hensies_1934.jpg/1200px-SAICOM_CHP_532_Hensies_1934.jpg</a>
  price: 5
  priceCurrency: EUR
  availability: {{{offerAvailability}}}
  validFrom: 2018-10-28T15:00:00Z

To be clear: the image is missing, and I cannot get the url to be valid. I tried several ways…

The Google search engine confirmed:

Invalid value type for field "url"

Invalid value type for field "availability"

Missing field "image"

Invalid URL in field "url"

Did I enter something wrong, or is something missing?

I tried to modify it to get "book", but so far I haven't succeeded.

Thanks in advance.

teran commented 5 years ago

@giby let’s check it out:)

Could you please post the resulting JSON/provide a link to that page?

giby commented 5 years ago

hmmm… not sure what a json/provide is…

The page I'm playing with is: http://www.histoire-valenciennes-cahv.fr/index.php?title=Les_impacts_environnementaux_de_l%E2%80%99extraction_du_charbon_dans_le_Valenciennois_et_le_Borinage_du_XVIIIe_si%C3%A8cle_%C3%A0_l%E2%80%99entre-deux_guerres

I wonder if it isn't the \/ instead of / that is the issue
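On the `\/` question: PHP's `json_encode()` writes a slash as `\/` by default, and that is valid JSON which decodes back to a plain `/`. The short check below illustrates this; the URL is shortened for the example, and whether the extension passes any flags to `json_encode()` is not something this sketch assumes.

```php
<?php
$url = 'http://www.histoire-valenciennes-cahv.fr/index.php';

$default   = json_encode(['url' => $url]);                          // slashes escaped as \/
$unescaped = json_encode(['url' => $url], JSON_UNESCAPED_SLASHES);  // slashes left as /

echo $default, "\n";
echo $unescaped, "\n";

// Both documents decode to exactly the same value.
var_dump(json_decode($default, true) === json_decode($unescaped, true));
```

So the escaped slashes on their own should not make a URL invalid for a JSON consumer.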

giby commented 5 years ago

the \/ instead of /

teran commented 5 years ago

@giby

The page I'm playing with is:

the link is enough :)

So these two:

Invalid value type for field "url" Invalid value type for field "availability"

are because of the really strange behaviour of the URL being provided as an HTML-formatted link. I'm gonna check it out.
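One way to guard against a value arriving as an HTML link is sketched below. This is not the fix used in the extension; `plainUrl()` is a hypothetical helper and the URL is a placeholder. It strips the markup, keeps the link text, and rejects anything that is not a valid URL (such as an unexpanded template parameter).

```php
<?php
/**
 * Recover a plain URL from a value that the wiki parser may have turned into an <a> element.
 * Hypothetical helper for illustration; not the fix used by the extension.
 */
function plainUrl(string $value): string {
    // strip_tags() removes the markup and keeps the link text,
    // which for a "free" external link is the URL itself.
    $text = trim(html_entity_decode(strip_tags($value), ENT_QUOTES, 'UTF-8'));
    // Return an empty string when what remains is not a valid URL.
    return filter_var($text, FILTER_VALIDATE_URL) !== false ? $text : '';
}

$raw = '<a rel="nofollow" class="external free" href="http://www.example.org/a.jpg">http://www.example.org/a.jpg</a>';

echo plainUrl($raw), "\n";                        // the bare URL
var_dump(plainUrl('{{{offerAvailability}}}'));    // string(0) ""
```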

Missing field "image"

This one is true: image isn't supported, based on the code I just checked, so I'm wondering where you got a field for it. It was left out because there's no clear UX for adding an image, so I skipped it for now in the alpha-stage Event extension, but I could easily add it the way you used it (though I'm still not sure that's the right UX for it).

giby commented 5 years ago

Thanks @teran,

Let me know when it's working so I can download it.

teran commented 5 years ago

@giby btw, I just got an idea: which editor are you using for your MediaWiki installation?

teran commented 5 years ago

@giby https://github.com/teran/mediawiki-GoogleRichCards/releases/tag/v0.3.1 is gonna fix your case, but it's a kinda dirty solution, so it needs some refactoring.

Please let me know on any issues, thanks for your feedback! :)

giby commented 5 years ago

@teran editor? I used filezilla, it isn't an editor

as editor I use either Xcode or gedit

teran commented 5 years ago

@giby don't worry, I had a suspicion about how the response was malformed, but it wasn't confirmed so nevermind :)

giby commented 5 years ago

Great :)

Now just 2 warnings left: -missing image field -invalid value type for availability

For the last one I set "|offerAvailability=http://schema.org/InStock" in order to obtain " "availability": "http://schema.org/InStock"," as in the example here: https://search.google.com/structured-data/testing-tool

Otherwise, I get a small error on the main page: https://search.google.com/structured-data/testing-tool#url=http%3A%2F%2Fwww.histoire-valenciennes-cahv.fr%2Findex.php%3Ftitle%3DCercle_Arch%25C3%25A9ologique_et_Historique_de_Valenciennes_et_de_son_arrondissement

there is an extra "," in

teran commented 5 years ago

-missing image field

You need to update Template:Event to fix that; it was changed recently.

-invalid value type for availability

I've fixed that in v0.3.2, please take a look how to use it in README.

there is an extra "," in

I've tried to reproduce that on the most recent MediaWiki release, but with no success. Could you please try to disable all the extensions you have enabled except GoogleRichCards to verify this malformed JSON came from GoogleRichCards ?

giby commented 5 years ago

All fixed!

Now; I'll try to setup the "book" hack!

giby commented 5 years ago

remote: Permission to teran/mediawiki-GoogleRichCards.git denied to giby. fatal: unable to access 'https://github.com/teran/mediawiki-GoogleRichCards.git/': The requested URL returned error: 403

teran commented 5 years ago

@giby You need to fork the repository to your account first, then create a branch and propose it as a pull request. The error you got from GitHub most likely happened because you tried to push changes to my instance of the mediawiki-GoogleRichCards repository, while you should normally push to your own. Please refer to the GitHub help pages for more details: https://help.github.com/articles/proposing-changes-to-your-work-with-pull-requests/

giby commented 5 years ago

Well, I created a branch first, let me fork that!

giby commented 5 years ago

@teran Fork done!

giby commented 5 years ago

@teran ? Any idea of what is wrong?

teran commented 5 years ago

@giby wrong with what? :-)

If you've forked it to your GitHub account, that's the right step, so you need to do your development there (preferably in a dedicated branch) and, when it's done, create a pull request (there's gonna be a button called "compare" or something like that).

giby commented 5 years ago

@teran my code crashes, and I don't know why…

Here is an issue:

public static function onParserFirstCallInit(Parser &$parser) {
    global $wgGoogleRichCardsAnnotateEvents;
    if($wgGoogleRichCardsAnnotateEvents) {
      $event = Event::getInstance();
      $parser->setHook('event', [$event, 'parse']);
    }
  }
 /*   public static function onParserFirstCallInit(Parser &$parser) {
        global $wgGoogleRichCardsAnnotateBooks;

        if($wgGoogleRichCardsAnnotateBooks) {
            $book = Event::getInstance();
            $parser->setHook('book', [$book, 'parse']);
        }
    }*/

I defined the same function twice, but when I tried to correct that, it was still crashing…

teran commented 5 years ago

@giby posting code inline even with backticks disables formatting making it unreadable. It's really hard to understand what's going on in the code and how it differs from already present parts. Please create a pull request(doesn't matter if it's crashing or doesn't work in any other way).

giby commented 5 years ago

I added an equivalent of public static function onParserFirstCallInit(Parser &$parser) for a 'book' page, which duplicated the function and led to a crash…

I don't know how to make a correct function for it…
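For reference, PHP does not allow the same method to be declared twice in one class, so both tags have to be registered from a single onParserFirstCallInit. The sketch below shows that shape and runs outside MediaWiki thanks to stub classes. Everything except the hook name, `Event::getInstance()` and `$wgGoogleRichCardsAnnotateEvents` is an assumption: the `Parser`, `Event` and `Book` classes here are stand-ins, and `Book`, `Hooks` and `$wgGoogleRichCardsAnnotateBooks` are hypothetical names.

```php
<?php
// Minimal stand-ins so the sketch runs outside MediaWiki. They are NOT the real classes.
class Parser {
    public $hooks = [];
    public function setHook($tag, callable $callback) {
        $this->hooks[$tag] = $callback;
    }
}
class Event {
    public static function getInstance() { return new self(); }
    public function parse($text) { return ''; }
}
class Book { // hypothetical counterpart of Event
    public static function getInstance() { return new self(); }
    public function parse($text) { return ''; }
}

class Hooks { // hypothetical class name
    // One method only: PHP refuses to declare the same method twice in a class.
    public static function onParserFirstCallInit(Parser &$parser) {
        global $wgGoogleRichCardsAnnotateEvents, $wgGoogleRichCardsAnnotateBooks;

        if ($wgGoogleRichCardsAnnotateEvents) {
            $parser->setHook('event', [Event::getInstance(), 'parse']);
        }
        if ($wgGoogleRichCardsAnnotateBooks) {
            // Uses Book, not Event, so the <book> tag gets its own handler.
            $parser->setHook('book', [Book::getInstance(), 'parse']);
        }
    }
}

// In LocalSettings.php these would be plain variable assignments.
$GLOBALS['wgGoogleRichCardsAnnotateEvents'] = true;
$GLOBALS['wgGoogleRichCardsAnnotateBooks'] = true; // hypothetical setting

$parser = new Parser();
Hooks::onParserFirstCallInit($parser);
print_r(array_keys($parser->hooks));
```

Note that the commented-out version in the snippet above calls Event::getInstance() for the book tag, which would reuse the event handler even once the duplicate declaration is removed.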

teran commented 5 years ago

@giby please, just share the code in readable way(pull request) it's not an issue to take a look and say what's wrong but it's not supposed to be done via comments.

In the sample above I see some stuff is commented out there, but I need the whole file to check it out correctly. Please create pull request.

giby commented 5 years ago

See my fork… So the begining of book.php is : here

and

my hook that crashes is here

See better?

teran commented 5 years ago

@giby I've created PR from your code at #3 and commented right there about issues. Let's move discussion related to code there because it allows to comment right by code lines which will make the conversation much more specific.

giby commented 5 years ago

Can you have another look? It isn't crashing now, but it is not functional…

teran commented 5 years ago

@giby Obviously it's not functional, you haven't used your own implementation :-) Please refer to comments at #3 it's covered there.

teran commented 2 years ago

Closing because of inactivity.