singanamala / google-blog-converters-appengine

Automatically exported from code.google.com/p/google-blog-converters-appengine
Apache License 2.0

wordpress2blogger gives me "not valid XML" error #45

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Get the tarball of r-79
2. cd bin
3. ./wordpress2blogger wordpress.2009-12-20.xml

What is the expected output? What do you see instead?
I'm expecting the converted blog as XML on standard output, but instead I get
this error: Input WordPress document is not valid XML!!

Error appears around line 725, column 25

<wp:comment_author_url>http://CVirus.Foolab.org</wp:comment_
------------------------^

I attached the xml file.

Original issue reported on code.google.com by cvir...@gmail.com on 23 Dec 2009 at 8:01

GoogleCodeExporter commented 8 years ago
WordPress is somewhat notorious for generating invalid XML for its export
files.  It's difficult to handle each type of error individually from the tool,
so I cleaned up the problem areas in the document for you.  Do a diff on the
two files to see what corrections were necessary.

You should be able to convert this attached file.  Let me know if you have any
issues.

Original comment by jlu...@google.com on 23 Dec 2009 at 8:17
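
For readers without a diff tool at hand, the comparison suggested above can be
sketched in Python with the standard difflib module. The helper name
diff_exports and the labels original.xml / cleaned.xml are placeholders chosen
for illustration; they are not part of wordpress2blogger.

```python
import difflib


def diff_exports(original_text, cleaned_text):
    """Return a unified diff (list of lines) between two export files' text."""
    return list(
        difflib.unified_diff(
            original_text.splitlines(),
            cleaned_text.splitlines(),
            fromfile="original.xml",  # placeholder label, not a real path
            tofile="cleaned.xml",     # placeholder label, not a real path
            lineterm="",
        )
    )
```

Read both files into strings first, then print the returned lines; removed
lines start with "-" and corrected lines start with "+".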

GoogleCodeExporter commented 8 years ago
Thanks a lot, it was converted properly, but now when I try to import it into
Blogger I get "There were problems importing the file." :-(

Original comment by cvir...@gmail.com on 24 Dec 2009 at 7:21

GoogleCodeExporter commented 8 years ago
Actually, this is what I'm getting: "Sorry, the import failed due to a server
error. The error code is bX-qm5h6h"

Any clue?

Original comment by cvir...@gmail.com on 27 Dec 2009 at 12:02

GoogleCodeExporter commented 8 years ago
Sure. If you are having problems with invalid XML and have no experience
editing XML, please attach a copy of your WordPress export file and I'll clean
it up for you.  Usually it takes only small edits, and I guarantee that all the
content will stay intact.

Original comment by jlu...@google.com on 10 Mar 2010 at 4:50

GoogleCodeExporter commented 8 years ago
I'm moving a blog from WordPress to Blogger. I've done it successfully before.
Now I'm trying to import it again, but I get an error even though the file
converted with the wordpress2blogger utility is less than 100 KB. Please help
me... Thanks.

Original comment by jjayk.wa...@gmail.com on 19 Mar 2010 at 2:56

GoogleCodeExporter commented 8 years ago
@jjayk It appears as though the bug that was causing your conversion to fail
has been fixed.  Please try to convert that file using
wordpress2blogger.appspot.com and let me know if you're still having problems.

Original comment by jlu...@gmail.com on 23 Mar 2010 at 3:02

GoogleCodeExporter commented 8 years ago
I'm also moving from WordPress to Blogger.  Attached is the exported WordPress
file, which is under 1MB.  When I use the Wordpress2Blogger conversion tool, I
repeatedly get the error message below.  Will you help, as you indicated in
Comment 5?

Error encountered during conversion.

Input WordPress document is not valid XML!!

Error appears around line 6351, column 8

<wp:ping_status>closed</wp:ping_status>
-------^

Original comment by 3daybl...@gmail.com on 2 Apr 2010 at 12:10

GoogleCodeExporter commented 8 years ago
Sure thing.  The file was indeed invalid XML.  It had to do with embedded CDATA
segments overlapping.  I've fixed the original file and converted it to a
Blogger format for you as well.  Both are attached to this issue.

Try to import the file in Blogger and update this issue if you still are
experiencing problems.

Original comment by jlu...@gmail.com on 2 Apr 2010 at 3:50

GoogleCodeExporter commented 8 years ago
Can anyone help? I'm getting the 'Input WordPress document is not valid XML!!'
error.

Thanks!

Original comment by alyndasl...@gmail.com on 4 Apr 2010 at 3:01

GoogleCodeExporter commented 8 years ago
I've fixed the invalid XML in the document and attached a Blogger converted 
file as well.  

Update this issue if you have any problems moving this new file to Blogger.

Original comment by jlu...@gmail.com on 4 Apr 2010 at 4:02

GoogleCodeExporter commented 8 years ago
Hi, I am facing a similar problem too. It says the XML file is invalid at the
very first line. Please also tell me what the actual problem with the XML file
is.

I would be very thankful. I am trying to convert my WordPress blog to Blogger.

Original comment by friend.s...@gmail.com on 17 Jun 2010 at 8:12

GoogleCodeExporter commented 8 years ago
For this file, friend.sunit, just remove the first line (which is a blank 
line), and the last line (which is a <script>...</script> element).  After 
that, this file will be proper XML and should convert for you just fine.

Hope that helps.  Let me know if you continue to see problems with the 
conversion.

Original comment by jlu...@google.com on 17 Jun 2010 at 2:05
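
The manual fix described above (dropping whatever precedes the XML declaration
and whatever follows the closing </rss> tag) could be automated roughly as
follows. This is only a sketch: the function name trim_export is made up for
illustration, and it assumes the export really is a single <rss> document with
stray content only before and after it.

```python
def trim_export(text):
    """Keep only the span from the XML declaration (or <rss>) through </rss>."""
    start = text.find("<?xml")
    if start == -1:
        # No XML declaration: fall back to the opening <rss tag.
        start = text.find("<rss")
    end = text.rfind("</rss>")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no <rss>...</rss> document found in the input")
    # Discard leading blank lines and trailing <script>/HTML residue.
    return text[start:end + len("</rss>")] + "\n"
```

The same approach would also cover exports that have a whole HTML page
appended after </rss>.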

GoogleCodeExporter commented 8 years ago
Hi, I am having the same issue using the WordPress2Blogger tool as the rest of 
these people who have posted. 

My error reads: 

(beginning of error)

Input WordPress document is not valid XML!!

Error appears around line 2114, column 2

-^

(end of error)

Can you please help me fix my file so I can convert it to Blogger?

Thanks!

..k..

Original comment by kej1...@gmail.com on 18 Jun 2010 at 1:23

GoogleCodeExporter commented 8 years ago
Sure thing.  This XML problem was related to one post that uses a CDATA 
section.  The WordPress exporter puts all post comments in a CDATA section as 
well, and nested CDATA sections are not proper XML.

I've attached the cleaned version of the WordPress file and the converted 
Blogger file.

Hope that helps.  Let me know if you have any more problems.

Original comment by jlu...@google.com on 18 Jun 2010 at 1:42
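
To see why nesting breaks the file: a CDATA section ends at the first "]]>", so
the second "]]>" is left stranded in ordinary content, where XML does not allow
it. A small sketch with Python's standard parser illustrates this. The
unprefixed element name comment_author is used here only to avoid declaring the
wp: namespace, and is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# One CDATA section: fine.
FLAT = "<comment_author><![CDATA[Mr WordPress]]></comment_author>"
# A CDATA section inside another one: the outer section closes early.
NESTED = "<comment_author><![CDATA[<![CDATA[Mr WordPress]]>]]></comment_author>"


def is_well_formed(xml_text):
    """Return True if the standard-library parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False
```

With these definitions, is_well_formed(FLAT) is true and
is_well_formed(NESTED) is false.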

GoogleCodeExporter commented 8 years ago
Hi, I don't know if you can help here. I have the same problem as the others,
which is invalid XML data. Another issue I'm having is that my WordPress
exported WXR is over 1MB, around 1.6MB to be precise. Is there any way I could
get this converted?

The WordPress WXR file is attached.

Original comment by bosunola...@gmail.com on 8 Jul 2010 at 7:48

GoogleCodeExporter commented 8 years ago
Are you able to help me also??  I have the same trouble as the others...

Original comment by ahesto...@gmail.com on 23 Jul 2010 at 2:41

GoogleCodeExporter commented 8 years ago
Jlu, are you still helping people with this?? please help me lol.  Thank you!!!

Original comment by ahesto...@gmail.com on 13 Aug 2010 at 9:31

GoogleCodeExporter commented 8 years ago
After a short hiatus, I'm back to helping out.  If anyone still has invalid XML 
problems, please let me know and I'll try to help out.

Original comment by jlu...@gmail.com on 18 Aug 2010 at 2:56

GoogleCodeExporter commented 8 years ago
Hi jlueck!  I have switched to Blogger, and have moved quite a few of my posts
over.  I only have a couple of months' worth of posts/comments left to go.  Is
that possible?  My file was a little over the limit anyway, so perhaps it is
good that I have done a few manually.  The problem with doing that is that the
comment dates are pre-set to the current date.

I have had a similar problem to the one the previous commenters had.

Original comment by ahesto...@gmail.com on 19 Aug 2010 at 8:07

GoogleCodeExporter commented 8 years ago
This wordpress file is actually valid if you remove the HTML document attached 
at the end of it (after </rss>).

I've removed this part and uploaded just in case.

Original comment by jlu...@google.com on 19 Aug 2010 at 4:34

GoogleCodeExporter commented 8 years ago
Thank you!  I will give it a try today!  You are very kind to offer such 
help... :D

Original comment by ahesto...@gmail.com on 19 Aug 2010 at 9:17

GoogleCodeExporter commented 8 years ago
On mine, there was no HTML code after the </rss> tag. My Error appears around 
line 83, column 54

ATA[<![CDATA[Mr WordPress]]>]]></wp:comment_author>
-----------------------------^

Could you please take a look at it and explain a bit why this is happening?
And yes, I really appreciate your work on wp2blogger!

Regards
Varun

Original comment by varun.dhanwantri@gmail.com on 30 Aug 2010 at 6:46

GoogleCodeExporter commented 8 years ago
@varun

This is happening because of the nested <![CDATA[]]> section.  You cannot have 
a CDATA section inside of another one.  If you remove the inner CDATA section 
for the <wp:comment_author> elements, it should all run correctly.

Original comment by jlu...@google.com on 15 Sep 2010 at 4:51
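
For large exports, removing the inner markers by hand gets tedious. One
possible way to script it is sketched below. It assumes the nested markers are
balanced (every inner "<![CDATA[" has its own "]]>"), which may not hold for
every export, so keep a backup of the original file. The name
flatten_nested_cdata is invented for this example.

```python
import re

# Matches either a CDATA opening marker or a closing marker.
_MARKER = re.compile(r"<!\[CDATA\[|\]\]>")


def flatten_nested_cdata(text):
    """Drop CDATA markers that appear inside an already open CDATA section."""
    out = []
    depth = 0  # how many CDATA openings are currently unclosed
    pos = 0
    for match in _MARKER.finditer(text):
        out.append(text[pos:match.start()])
        pos = match.end()
        if match.group() == "<![CDATA[":
            if depth == 0:
                out.append(match.group())  # keep only the outermost opening
            depth += 1
        else:
            if depth <= 1:
                # Closes the outermost section; a stray "]]>" is left as-is.
                out.append(match.group())
            depth = max(depth - 1, 0)
    out.append(text[pos:])
    return "".join(out)
```

Only the inner markers are dropped; the text between them is kept, so
"<![CDATA[<![CDATA[Mr WordPress]]>]]>" becomes "<![CDATA[Mr WordPress]]>".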

GoogleCodeExporter commented 8 years ago
Error encountered during conversion.

Input WordPress document is not valid XML!!

Error appears around line 4926, column 8

<pubDate>Fri, 07 May 2010 19:41:28 +0000</pubDate>

How can I correct this mistake? Thanks)))

Original comment by 7liberta...@gmail.com on 23 Sep 2010 at 10:28

GoogleCodeExporter commented 8 years ago
Are you still helping users with their invalid xml issues?

Original comment by zillas...@gmail.com on 21 Nov 2010 at 6:18

GoogleCodeExporter commented 8 years ago
Is there any way to programmatically remove the nested CDATA?  I have attempted
to do it by hand but have had no luck so far, as there appears to be just a ton
of it in the fairly large export I'm converting.  Is it more common in older or
newer builds of WordPress?

Original comment by n...@buraglio.com on 26 Nov 2010 at 8:16

GoogleCodeExporter commented 8 years ago
I believe I've removed all of my nested CDATA, but I'm still seeing an error
about invalid XML.  The error follows:
Input WordPress document is not valid XML!!

Error appears around line 129711, column 391

<item>
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
----------------------------------------------------------------------^

Unfortunately the file seems to be too large to attach (11M).  Has anyone seen 
this before?  Any advice?   

Original comment by n...@buraglio.com on 1 Dec 2010 at 5:04
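
When a file is too large to attach or to inspect by eye, it may help to run a
parser locally and print the text around the reported position. The sketch
below uses Python's standard expat binding; find_xml_error is a hypothetical
helper, and it is not claimed to report exactly the same positions as
wordpress2blogger does.

```python
import xml.parsers.expat


def find_xml_error(data, context=40):
    """Parse bytes; return (line, column, nearby text) for the first error,
    or None if the document is well-formed."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        lines = data.splitlines()
        # err.lineno is 1-based; err.offset is the 0-based column.
        line = lines[err.lineno - 1] if err.lineno - 1 < len(lines) else b""
        start = max(err.offset - context, 0)
        snippet = line[start:err.offset + context]
        return err.lineno, err.offset, snippet.decode("utf-8", "replace")
    return None
```

Open the export in binary mode, pass its contents to the function, fix what
the snippet shows, and repeat until it returns None.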

GoogleCodeExporter commented 8 years ago
I'm pretty much at a loss on this one unless someone else can point me at what 
nested CDATA I'm missing in the file......anyone?  Anyone?  Bueller? 

Original comment by n...@buraglio.com on 6 Jan 2011 at 7:29

GoogleCodeExporter commented 8 years ago
Hi, I tried to convert my XML file but I get the following problem:

Error encountered during conversion.

Input WordPress document is not valid XML!!

Error appears around line 10642, column 229

<wp:post_type>post</wp:post_type>

I've attached my file. Could you please clean it so that I can import it into
my new Blogger blog?

ciao

Original comment by taurinor...@gmail.com on 23 Jan 2011 at 3:36

GoogleCodeExporter commented 8 years ago
The problem was in an XML entity called &ndash; that wasn't defined.  I've
replaced that with a proper dash ('-') character and converted it.  Hope that
helps.

Original comment by jlu...@google.com on 24 Jan 2011 at 3:50
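
XML predefines only five named entities (amp, lt, gt, quot, apos); HTML names
such as ndash are undefined unless a DTD declares them. Instead of swapping in
a plain hyphen, one alternative is to rewrite such names as numeric character
references, which every XML parser accepts. The sketch below does that with
Python's html.entities table. The function name is made up, and note that it
also rewrites matches inside CDATA sections, where they were literal text.

```python
import re
from html.entities import name2codepoint

# The only named entities XML defines without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_html_entities(text):
    """Turn HTML-only named entities into numeric character references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave valid or unknown names untouched
        return "&#%d;" % name2codepoint[name]

    return _ENTITY.sub(repl, text)
```

This keeps the original dash in the post rather than substituting a hyphen.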

GoogleCodeExporter commented 8 years ago
You are a hero, thank you very much!

Original comment by taurinor...@gmail.com on 25 Jan 2011 at 10:49

GoogleCodeExporter commented 8 years ago
Hey victorbyte.

I had to replace some null characters with '\0' in your cpp examples to get 
this to work.  I've attached the cleaned wordpress export and the converted 
Blogger file.  Hope that helps.

Original comment by jlu...@google.com on 8 Feb 2011 at 5:09

GoogleCodeExporter commented 8 years ago
Hello, I read that you are correcting wordpress XML's that are showing up as 
invalid. I have no knowledge of XML's but I'd love to be able to import this 
into blogger. Thanks for your help

Original comment by thereall...@gmail.com on 27 Mar 2011 at 5:23

GoogleCodeExporter commented 8 years ago
Hello, may I get assistance with mine as well?  Thanks in advance for your help.

Original comment by rashadin...@gmail.com on 27 Mar 2011 at 8:06

GoogleCodeExporter commented 8 years ago
Here is a cleaned XML file for thereall...@gmail.com

The problem was nested CDATA sections.  Wordpress will add a CDATA section for 
the content of any post, and some of the posts contained Javascript that also 
had CDATA sections.  I've removed the nested CDATA sections from the Javascript 
and it should be good to go for conversion.

Original comment by jlu...@gmail.com on 28 Mar 2011 at 4:41
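
If the JavaScript's own CDATA markers need to survive untouched, a common
alternative to deleting them is to split every "]]>" in the content across two
CDATA sections, so the outer section never ends early. The sketch below shows
that idea; wrap_cdata and round_trips are illustrative names, and this is not
how WordPress or wordpress2blogger is known to behave.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(content):
    """Wrap text in CDATA, splitting any embedded "]]>" so it stays valid."""
    # "]]>" becomes "]]" + end of section + start of new section + ">".
    return "<![CDATA[" + content.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def round_trips(content):
    """Check that a parser reads back exactly the text that was wrapped."""
    element = ET.fromstring("<content>" + wrap_cdata(content) + "</content>")
    return element.text == content
```

A post body such as "//<![CDATA[ ... //]]>" then parses back unchanged, at the
cost of a slightly uglier export file.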

GoogleCodeExporter commented 8 years ago
Wow, thanks! You're my hero. I was able to convert it without any issues. I
can't seem to get the Blogger importer to work though, as I get a server error
(code bX-sz920t), but I imagine this is a Google issue.

Thanks again for your help!

Original comment by ian1...@gmail.com on 28 Mar 2011 at 9:55

GoogleCodeExporter commented 8 years ago
Hi... This is the same person as thereallesliestar@gmail.com. I appreciate your 
help with converting my XML to be "blogger ready" but I'm having no luck 
getting it through the blogger import tool. I've been getting a new error code 
that nobody else has gotten. 

I've re-exported my wordpress blog using different settings and am wondering if 
the smaller file will yield better results. Unfortunately it gets the invalid 
XML error and I just don't know how to fix it. If you have the time, could you 
fix the attached xml. I greatly appreciate your efforts. 

Original comment by ian1...@gmail.com on 29 Mar 2011 at 4:21

GoogleCodeExporter commented 8 years ago
Hi, I want to convert my WordPress XML to Blogger but I ran into a problem.
The XML exported from WordPress is 1.1MB, so I downloaded the
http://code.google.com/p/google-blog-converters-appengine/ package and
set up the proper environment (py2.6 + gdata inside virtualenv).
After I run ./bin/wordpress2blogger.sh <xml-here> it reports an invalid XML
error on some line. Can you please help me with this? Attached is my XML file.

Original comment by rbbeltra...@gmail.com on 5 Apr 2011 at 5:50

GoogleCodeExporter commented 8 years ago
Here is a cleaned version of your wordpress file.  This file had the same type 
of nested CDATA entry stuff that makes it invalid XML as other files in this 
issue.  It has come up now in a number of wordpress exports that I've seen so 
I'll try to submit a bug to the developers of Wordpress.

Enjoy.

Original comment by jlu...@google.com on 5 Apr 2011 at 7:37

GoogleCodeExporter commented 8 years ago
Hi. I also have problems during conversion. I tried to resolve them myself with
XML Copy Editor (and spent too much time on it) but without success =((
Please help me with the attached file.

Original comment by s...@sselin.com on 6 Apr 2011 at 11:53

GoogleCodeExporter commented 8 years ago
Here you go.  There were two illegal characters.  I've removed them without
affecting the content.

Original comment by jlu...@google.com on 7 Apr 2011 at 4:05
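
Illegal characters of this kind (NUL bytes and most other control characters)
can be found and removed mechanically, because XML 1.0 lists exactly which code
points are allowed. A minimal sketch, assuming the file has already been
decoded to a Python string; strip_illegal_xml_chars is a name chosen for this
example.

```python
import re

# Anything outside the XML 1.0 "Char" production: tab, LF, CR, and the
# ranges U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF are allowed.
_ILLEGAL = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text, replacement=""):
    """Remove (or replace) characters that XML 1.0 does not permit."""
    # A function is used so the replacement is inserted literally.
    return _ILLEGAL.sub(lambda match: replacement, text)
```

Passing a replacement such as the two-character text \0 keeps a visible
placeholder where a NUL byte used to be, which can matter in code samples.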

GoogleCodeExporter commented 8 years ago
Thanks a lot!

Original comment by s...@sselin.com on 7 Apr 2011 at 4:10

GoogleCodeExporter commented 8 years ago
I keep getting the bX-sz920t error, and it's driving me nuts!  Any help you
might be able to offer would be greatly appreciated.

Original comment by illitera...@gmail.com on 12 Apr 2011 at 12:14

GoogleCodeExporter commented 8 years ago
Too bad, I can't upload the cleaned XML file to Blogger; the blog import always
gives me a server error and a code. Is anyone else experiencing the same
problem? I've been out of luck for days now.

Original comment by rbbeltra...@gmail.com on 13 Apr 2011 at 9:46