some-programs / exitwp

Exitwp is a tool primarily aimed at making migration from one or more WordPress blogs to the Jekyll blog engine as easy as possible.

Support extracting "Feature Image" #73

Open steren opened 6 years ago

steren commented 6 years ago

Thanks a lot for this tool.

Today, WordPress allows writers to attach a "Feature Image" to each post. This image is used, for example, as a preview when listing posts or as a header image when viewing the post.

I looked at my wordpress.xml file, and it seems that the image is captured as follows:

The post <item> contains this metadata:

  <wp:postmeta>
    <wp:meta_key>_thumbnail_id</wp:meta_key>
    <wp:meta_value><![CDATA[1089]]></wp:meta_value>
  </wp:postmeta>

This references another <item> whose ID matches: <wp:post_id>1089</wp:post_id>

The URL of the image is stored in that item's <wp:attachment_url> element and/or in its <guid isPermaLink="false"> element.

Is this something that exitwp could support?

LorenzBischof commented 5 years ago

A bit hacky, but it works:

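      # Look up the post's "_thumbnail_id" meta entry (i appears to be the current <item>)
      thumbnail_id = i.findall(ns['wp']+'postmeta/['+ns['wp']+'meta_key="_thumbnail_id"]')
      attachment_url = None
      if thumbnail_id:
          thumbnail_id = thumbnail_id[0].find(ns['wp']+'meta_value').text
          # Find the attachment <item> with the matching post_id (c appears to be the <channel>)
          attachment = c.findall('item/['+ns['wp']+'post_id="'+thumbnail_id+'"]')
          if attachment:
              attachment_url = attachment[0].find(ns['wp']+'attachment_url').text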

You need to add it somewhere in the parse_header() function and return it in the export_item object. Then add it to the yaml_header.
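
For anyone who wants to try the lookup outside of exitwp first, here is a minimal standalone sketch of the same idea. It is not exitwp's actual code: the `featured_image_url` helper, the `SAMPLE` export, the `image` header key, and the WXR 1.2 namespace URI are all assumptions made for illustration, so check the `xmlns:wp` value in your own export.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for WXR 1.2 exports; verify against xmlns:wp in your file.
WP = '{http://wordpress.org/export/1.2/}'

# Hypothetical two-item export: a post and the attachment it points to.
SAMPLE = """<rss xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello</title>
      <wp:post_id>42</wp:post_id>
      <wp:postmeta>
        <wp:meta_key>_thumbnail_id</wp:meta_key>
        <wp:meta_value><![CDATA[1089]]></wp:meta_value>
      </wp:postmeta>
    </item>
    <item>
      <title>header</title>
      <wp:post_id>1089</wp:post_id>
      <wp:attachment_url>http://example.com/header.jpg</wp:attachment_url>
    </item>
  </channel>
</rss>"""


def featured_image_url(channel, item):
    """Return the attachment URL referenced by the item's _thumbnail_id, or None."""
    for meta in item.findall(WP + 'postmeta'):
        if meta.findtext(WP + 'meta_key') == '_thumbnail_id':
            thumbnail_id = (meta.findtext(WP + 'meta_value') or '').strip()
            break
    else:
        return None  # the post has no thumbnail meta entry
    # Scan all items for the attachment whose post_id matches.
    for candidate in channel.findall('item'):
        if candidate.findtext(WP + 'post_id') == thumbnail_id:
            return candidate.findtext(WP + 'attachment_url')
    return None


channel = ET.fromstring(SAMPLE).find('channel')
post = channel.findall('item')[0]

# Stand-in for the YAML header dict; 'image' is a made-up key name.
yaml_header = {'title': post.findtext('title')}
url = featured_image_url(channel, post)
if url:
    yaml_header['image'] = url
print(yaml_header)
```

Comparing the text in Python rather than in an XPath predicate avoids building the query by string concatenation, at the cost of scanning every item per post. For a large export, building a dict from post_id to attachment URL once would probably be the better choice.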