symfony-cmf / seo-bundle

A SEO Solution for duplicate contents, page titles, etc.
https://cmf.symfony.com

Create XML Sitemap #132

Closed: ElectricMaxxx closed this 9 years ago

ElectricMaxxx commented 10 years ago

Since we have access to the routes inside PHPCR and the CMF handles routing, it would be easy to create an XML representation of their structure.

But I would not only suggest creating an XML sitemap; an HTML representation would be doable too. Maybe we could add a configuration value for a template to render beautiful lists of routes, maybe with some previews of the content.

The XML generation could be done by SonataSeoBundle. The rest would need to be done by us.

dbu commented 10 years ago

the xml sitemap sounds very useful indeed. for an html sitemap, i am less sure. maybe we can just provide the thing that walks the phpcr tree in a way that it can be reused by somebody wanting to build the html sitemap. how to actually do that is probably very site specific (what to include / exclude in the sitemap, for example).

benglass commented 10 years ago

In this case we would want to add a sitemap priority and a boolean "exclude from sitemap" flag to the SEO metadata.

For an HTML sitemap I think that KnpMenuBundle is a better choice for the task. In other words, I would advocate sticking to just XML, since that is directly relevant to SEO, whereas a sitemap page on your site is less so and is already handled by KnpMenu.

benglass commented 10 years ago

I would also consider just providing instructions to users about possible sitemap solutions, as opposed to trying to implement one. The reason I say that is that there are already two bundles out there that I can think of that provide sitemap functionality: SonataSeo and PrestaSitemap.

SonataSeo is somewhat limited because it is based on SQL queries (unless they have updated it). This makes it a poor fit for CMF documents, because writing a raw SQL query that generates the URL for a CMF document is difficult or not feasible.

PrestaSitemap is the bundle we chose because it is more flexible in the ways you can populate the sitemap, and it provides the concept of "sections", which we use to implement multiple sitemaps for different websites in a multi-host solution.

This bundle could definitely provide a standardized way to store sitemap-related information for dynamic objects, like the sitemap priority and whether an object should be excluded from the sitemap.

benglass commented 10 years ago

Correction: sonata admin is capable of generating sitemaps via services, although this is not documented.

https://github.com/sonata-project/SonataSeoBundle/blob/master/DependencyInjection/SonataSeoExtension.php#L100
http://sonata-project.org/bundles/seo/master/doc/reference/sitemap.html

ElectricMaxxx commented 10 years ago

Not only the admin bundle; the sonata-seo-bundle should also provide some functions to show sitemaps.

dbu commented 10 years ago

while doing this, we could also look into https://support.google.com/webmasters/answer/2620865?hl=en and #166 to provide language alternatives in the xml sitemap.
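For reference, the language alternatives mentioned here are expressed in a sitemap as `xhtml:link` elements inside each `url` entry, with every language version listing all versions including itself. A minimal sketch in plain PHP, assuming PHP 8 and the XMLWriter extension; the function name and the example URLs are made up for illustration and are not part of the bundle:

```php
<?php

/**
 * Hypothetical helper: render one <url> element per translation, each one
 * listing every language version (including itself) as an alternate.
 *
 * @param array<string, string> $translations locale => absolute URL
 */
function renderAlternatesSitemap(array $translations): string
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->setIndent(true);
    $xml->startDocument('1.0', 'UTF-8');

    $xml->startElement('urlset');
    $xml->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');
    $xml->writeAttribute('xmlns:xhtml', 'http://www.w3.org/1999/xhtml');

    foreach ($translations as $url) {
        $xml->startElement('url');
        $xml->writeElement('loc', $url);

        foreach ($translations as $locale => $alternateUrl) {
            $xml->startElement('xhtml:link');
            $xml->writeAttribute('rel', 'alternate');
            $xml->writeAttribute('hreflang', (string) $locale);
            $xml->writeAttribute('href', $alternateUrl);
            $xml->endElement(); // xhtml:link
        }

        $xml->endElement(); // url
    }

    $xml->endElement(); // urlset
    $xml->endDocument();

    return $xml->outputMemory();
}

// Example data (made up): one page available in English and German.
$alternatesXml = renderAlternatesSitemap([
    'en' => 'https://example.com/en/',
    'de' => 'https://example.com/de/',
]);

echo $alternatesXml;
```

In the bundle itself the locale-to-URL map would presumably come from the URL generator rather than a hard-coded array.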

ElectricMaxxx commented 10 years ago

:+1:

ElectricMaxxx commented 10 years ago

Just some more questions:

  1. What should be configurable?
  2. Including/excluding some routes
  3. A global value for priority/changefreq
  4. How to generate lastmod? Do our documents provide some information about it?
  5. Extract/generate the priority from the route/content?

We could do that with extractors, but since @WouterJ wanted to deprecate them, how else should we handle it?

From a technical point of view, I wanted to use the sitemap stuff from SeoBundle, but it only provides an abstraction for querying Doctrine. So I will loop through all routes and create the XML using JmsSerializer.

ElectricMaxxx commented 10 years ago

Btw, using extractors makes no sense when looping through a list of routes. So we will need some mapping/configuration to get the information from the routes' content.

ElectricMaxxx commented 10 years ago

I came to the conclusion that it will be better to start with the content and create the URL from it with the help of the URL generator. That way we will be able to generate the alternate URLs too, as I did in #175.

I also think we should implement a provider mechanism to cover all possible databases or content sources. I would suggest writing a default provider that creates a collection of sitemap entries, so it is up to each provider implementation how to populate the properties of those entries. With that in place, one of the other providers could simply use the sonata-seo-bundle way.

The output of that list should depend on the content type requested in the header. Usually Google would request application/json, but why not serve a rendered template when somebody requests text/html?

Just one question: I would offer a configuration option for the URL of the sitemap. Any hints on how to generate a route from it directly in the bundle's extension? Is there something like $container->addRoute()? (I think so, right?)
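The provider idea could be sketched roughly as follows. This is only an illustration in plain PHP (assuming PHP 8 and the XMLWriter extension), not the API the bundle ended up with: `SitemapEntry`, `SitemapEntryProvider`, `StaticProvider` and `renderProviderSitemap()` are hypothetical names, and the static provider stands in for a PHPCR- or ORM-backed one.

```php
<?php

// All names below are hypothetical; this is not the bundle's actual API.

final class SitemapEntry
{
    public function __construct(
        public string $loc,
        public ?string $lastmod = null,    // W3C datetime, e.g. "2014-07-10"
        public ?string $changefreq = null, // e.g. "daily" or "never"
        public ?float $priority = null     // 0.0 to 1.0
    ) {
    }
}

interface SitemapEntryProvider
{
    /** @return iterable<SitemapEntry> */
    public function getEntries(): iterable;
}

/** Stand-in for a real content source: simply wraps a fixed list. */
final class StaticProvider implements SitemapEntryProvider
{
    /** @param SitemapEntry[] $entries */
    public function __construct(private array $entries)
    {
    }

    public function getEntries(): iterable
    {
        return $this->entries;
    }
}

/**
 * Collect the entries of all providers into one XML sitemap.
 *
 * @param iterable<SitemapEntryProvider> $providers
 */
function renderProviderSitemap(iterable $providers): string
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->setIndent(true);
    $xml->startDocument('1.0', 'UTF-8');
    $xml->startElement('urlset');
    $xml->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');

    foreach ($providers as $provider) {
        foreach ($provider->getEntries() as $entry) {
            $xml->startElement('url');
            $xml->writeElement('loc', $entry->loc);
            // Optional properties are only written when the provider set them.
            if (null !== $entry->lastmod) {
                $xml->writeElement('lastmod', $entry->lastmod);
            }
            if (null !== $entry->changefreq) {
                $xml->writeElement('changefreq', $entry->changefreq);
            }
            if (null !== $entry->priority) {
                $xml->writeElement('priority', number_format($entry->priority, 1, '.', ''));
            }
            $xml->endElement(); // url
        }
    }

    $xml->endElement(); // urlset
    $xml->endDocument();

    return $xml->outputMemory();
}

// Example data (made up): two providers feeding one sitemap.
$pages = new StaticProvider([new SitemapEntry('https://example.com/', '2014-07-10', 'daily', 1.0)]);
$news = new StaticProvider([new SitemapEntry('https://example.com/news/launch', null, 'never', 0.5)]);

$providerXml = renderProviderSitemap([$pages, $news]);

echo $providerXml;
```

Keeping the entry a plain value object means the same collection could also be handed to a template for an HTML rendering, which is the split between data and output format suggested above.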

dbu commented 10 years ago

asking to explicitly register the routing.xml file and define it in there has the benefit that it's more visible and one could customize the url, though that would be a bad idea with google.

i think the provider idea makes the most sense. there can be so many different kinds of logic. maybe we need a visitor pattern or something? the visitor would be passed routes and metadata. then the metadata can be inferred from the content of the route (e.g. a news item does not change once it's published, the homepage updates often, ...) or from some manual data stored somewhere.

ElectricMaxxx commented 10 years ago

Got one little performance issue in my head: creating the sitemap by looping through the content and doing one UrlGenerator::generate($content) call per document would cause N+1 queries, right? Should I relax that by querying the routes collection manually?

dbu commented 10 years ago

it's going to be slow either way, i fear. i guess the thing to do is provide a command to optionally dump the sitemap to the fs, to take load off the db.

but you can experiment with prefetching data - sometimes it helps. sometimes it also hurts more, so definitely try it first. and make it optional.
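The "dump to the fs" part could look roughly like this in plain PHP. It is only a sketch: `dumpSitemap()` is a hypothetical helper, the XML string is a placeholder for whatever the generator produces, and a real command would write into the web root instead of the temp directory.

```php
<?php

/**
 * Hypothetical helper: write the generated sitemap next to its final
 * location first and then rename it into place, so a crawler never sees
 * a half-written file.
 */
function dumpSitemap(string $xml, string $targetPath): void
{
    $tmpPath = $targetPath . '.tmp';

    if (false === file_put_contents($tmpPath, $xml, LOCK_EX)) {
        throw new RuntimeException(sprintf('Could not write "%s".', $tmpPath));
    }

    if (!rename($tmpPath, $targetPath)) {
        throw new RuntimeException(sprintf('Could not move "%s" to "%s".', $tmpPath, $targetPath));
    }
}

// Placeholder content and target path, for illustration only.
$dumpTarget = sys_get_temp_dir() . '/sitemap-example.xml';
$dumpXml = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"/>';

dumpSitemap($dumpXml, $dumpTarget);
```

The web server can then serve the dumped file directly, and the database is only hit when the dump is regenerated.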

dbu commented 10 years ago

@ElectricMaxxx did you start any work on this? if not i will probably tackle this soon.

ElectricMaxxx commented 10 years ago

not yet, I just prepared the "many-function" for the alternate locale in the other PR.

ElectricMaxxx commented 10 years ago

@WouterJ I had a look at the KunstmaanSitemapBundle (https://github.com/Kunstmaan/KunstmaanSitemapBundle). Because of the strong coupling to the ORM we can't use the controller, and by the way there is no way to hook into it. So the only thing we could use would be the templates and the Twig extensions. Is that enough to justify depending on that bundle?

wouterj commented 10 years ago

Unless the twig extension is very complex, -1 :)

dbu commented 10 years ago

or try to refactor the kunstmaan bundle to the point where it can do what we want. kunstmaan is not using the CmfRoutingBundle, or is it? if not, we would need to replace the whole part about url generation too. but if we can refactor and use a substantial part of their bundle then, i think it would be worth it.

dbu commented 10 years ago

oh, actually i am -1 now. looked at composer.json and they not only require their admin and "node" bundle but also fosuserbundle, which imo has NOTHING to do with a sitemap. unless they can provide a lot of valuable things and we find a way to fix (=remove) those dependencies, i doubt it's worth it. maybe we can steal the concept. or make their bundle depend on the cmf bundle to eliminate their general code and only keep the integration with all the other stuff they seem to do :-)

ElectricMaxxx commented 10 years ago

... and the Twig extensions aren't as powerful as they seemed to be. There are more extensions for hiding a node than for displaying one :-)

ElectricMaxxx commented 10 years ago

Conclusion of what we planned in the comments above:

wouterj commented 10 years ago

I think you're missing some things:

ElectricMaxxx commented 10 years ago

thanks @WouterJ

Why the listener? Would it be enough if somebody creates their own custom route and maps it to the controller/action we provide?

wouterj commented 10 years ago

Why the listener? Would it be enough if somebody creates their own custom route and maps it to the controller/action we provide?

Yeah, but I don't like using routes to configure a feature. I use a listener for that most of the time.

ElectricMaxxx commented 9 years ago

solved by #196