danielbachhuber / CoPress-Convert

For converting College Publisher 4, College Publisher 5 and other databases to WordPress eXtended RSS
GNU General Public License v2.0

CollegePublisher Databases #4

Open bhalpern opened 14 years ago

bhalpern commented 14 years ago

Hi All-

I wanted to report an issue that I discovered while migrating http://www.campustimes.org/ with this script. But first, thanks to Daniel, all of the folks at CoPress.org, and anybody else who has contributed to this script… it's much easier than what my friend Brian at the Miami Hurricane had to go through a few years ago.

Now to the bug… this issue applies to CP5, which is the database format I was working with. I am not familiar with CP4 and do not know how this affects those users, but it may still be partially pertinent. The CoPress-Convert program crashed when I tried to run it on my database, with the following traceback:

    Traceback (most recent call last):
      File "CoPress-Convert.py", line 848, in <module>
        main()
      File "CoPress-Convert.py", line 821, in main
        version,stories,images = importStories(verbose)
      File "CoPress-Convert.py", line 732, in importStories
        content_id = content_id[1]
    IndexError: list index out of range

It turns out… CollegePublisher databases are not in any semblance of a standard format. The good news is, it's easily fixable with a couple of minor coding tweaks to this script. First, I opened my database in a database editor that made it easy to visualize columns (a SQL solution, or even Microsoft Excel or Access, would work).

Let's start by looking at lines 733-746 of the script, beginning:

    story = [content_id,line[1],"CP5 - MISSING",line[2],line[11],line[12],line[4],line[3]] # Normal CP 5

Basically, all you have to do to fix the issue is change the column numbers to match your database format. The array object on line 737 is supposed to automatically map the correct columns, but because my database didn't have headers, that wasn't possible. To fix it, I commented out lines 737-746 and used the format above for my 'story' variable in their place, with my custom column numbers as follows:

    story = [content_id,line[0],"Unknown",line[3],line[15],line[18],line[9],"Unknown"] # My strange CP 5
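For anyone whose columns differ again, the remapping can be sketched as data instead of a hand-edited line. This is only an illustration and not part of CoPress-Convert: the names NORMAL_CP5, CUSTOM_CP5 and build_story are hypothetical. The column numbers are the ones from the two 'story' lines above, and the meaning of each position in the list is whatever the script expects (it is not spelled out here).

```python
# Hypothetical layouts: an int means "take this column from the row",
# a string means "use this literal placeholder".
NORMAL_CP5 = (1, "CP5 - MISSING", 2, 11, 12, 4, 3)      # from the stock 'story' line
CUSTOM_CP5 = (0, "Unknown", 3, 15, 18, 9, "Unknown")    # from my headerless database


def build_story(content_id, line, layout):
    """Build the 'story' list for one database row using a column layout."""
    story = [content_id]
    for spec in layout:
        if isinstance(spec, int):
            story.append(line[spec])   # pull the value from that column
        else:
            story.append(spec)         # keep the placeholder text as-is
    return story


if __name__ == "__main__":
    # Fake row with 20 columns, just to show which columns end up where.
    row = ["col%d" % i for i in range(20)]
    print(build_story("42", row, CUSTOM_CP5))
```

To support another odd export, only a new layout tuple would be needed; the function itself stays the same.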

That modification allowed for successful export, but upon WordPress 3.0 import, the dates were all wrong. CP was also kind enough to store dates in a non-Unix readable format. To make things work, I had to add a new variable on line 733, right above 'story = …'.

    mydate = dateConvert(line[0])

This is a simple method that I had to write, and it is as follows (I put this method code on line 775, but it could go anywhere):

    def dateConvert(date):
        date = date.split(" ")
        if date[1] == "Jan":
            date[1] = "01"
        elif date[1] == "Feb":
            date[1] = "02"
        elif date[1] == "Mar":
            date[1] = "03"
        elif date[1] == "Apr":
            date[1] = "04"
        elif date[1] == "May":
            date[1] = "05"
        elif date[1] == "Jun":
            date[1] = "06"
        elif date[1] == "Jul":
            date[1] = "07"
        elif date[1] == "Aug":
            date[1] = "08"
        elif date[1] == "Sep":
            date[1] = "09"
        elif date[1] == "Oct":
            date[1] = "10"
        elif date[1] == "Nov":
            date[1] = "11"
        elif date[1] == "Dec":
            date[1] = "12"
        else:
            date[1] = "12"
        tempdate = date[2]
        tempdate = tempdate[0:-1]
        if len(tempdate) == 1:
            tempdate = "0"+tempdate
        return date[3]+"-"+date[1]+"-"+tempdate+" 00:00:00"
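The same conversion can be sketched more compactly with a lookup table. This is only an illustration, not what the script ships with: the name date_convert and the sample input "Wed Sep 3, 2008" are assumptions, since the exact CP date string isn't shown here. The sketch relies on the same layout dateConvert does: month abbreviation in the second space-separated field, day plus one trailing character in the third, and year in the fourth.

```python
# Month abbreviation -> two-digit month number.
MONTHS = {
    "Jan": "01", "Feb": "02", "Mar": "03", "Apr": "04",
    "May": "05", "Jun": "06", "Jul": "07", "Aug": "08",
    "Sep": "09", "Oct": "10", "Nov": "11", "Dec": "12",
}


def date_convert(date):
    """Turn a CP-style date string into 'YYYY-MM-DD 00:00:00'."""
    parts = date.split(" ")
    # Unknown abbreviations fall back to "12", mirroring the else branch
    # of dateConvert.
    month = MONTHS.get(parts[1], "12")
    # Drop the trailing character after the day (assumed to be a comma),
    # then left-pad single-digit days with a zero.
    day = parts[2][:-1].zfill(2)
    year = parts[3]
    return year + "-" + month + "-" + day + " 00:00:00"


if __name__ == "__main__":
    # Assumed sample input; real CP exports may differ.
    print(date_convert("Wed Sep 3, 2008"))
```

If the real date strings turn out to follow one fixed pattern, datetime.strptime from the standard library would be another option, but the format string would have to be checked against the actual data.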

After this additional modification, everything worked splendidly and I was able to import into WordPress 3.0.1, which is what I'm using for the new site. I hope this could be helpful to somebody else, and perhaps a permanent solution could be someday incorporated into the regular download. Thanks to Brad Orego for his help with the data conversion. Good luck to all of you with your new sites!

Bradley
Campus Times
University of Rochester
Rochester, New York
http://www.campustimes.org/

danielbachhuber commented 13 years ago

Hey Bradley, Thanks much for taking the time to report this issue. I'm doing another migration in a couple weeks and will update the conversion script at that point.

Cheers,

Daniel