towerofpower256 / WritingExporter.SimpleExporter

Tool for exporting interactive stories from Writing.com to human-readable files.
MIT License

Error: Could not find interactive description #5

Status: Closed (left1000 closed this issue 5 years ago)

left1000 commented 5 years ago

Anyone have any clue what I'm doing wrong? I tried the full URL, the URL minus the title, page 1, and the outline page. They all failed utterly: the only thing saved was the description of the story shown before entering it, and nothing else at all.

this is what it told me:

INFO - Main window ready

INFO - Opening story from Writing.com

INFO - Story opened: 885928-Television-Women-expand

INFO - Updating story content from Writing.com

ERROR - An error occurred while trying to export the story. System.Exception Couldn't find the interactive chapter's title

INFO - Opening story from Writing.com

INFO - Story opened: 1911783-Madisons-Freshman-15

INFO - Updating story content from Writing.com

ERROR - An error occurred while trying to export the story. System.Exception Couldn't find the interactive chapter's title

ERROR - Error while exporting story to HTML System.ArgumentNullException Value cannot be null. Parameter name: input

INFO - Opening story from Writing.com

INFO - Story opened: 1911783

INFO - Updating story content from Writing.com

INFO - Story export complete

INFO - Story update complete

INFO - Story exported to HTML

INFO - Updating story content from Writing.com

INFO - Story export complete

INFO - Story update complete

INFO - Opening story from Writing.com

ERROR - An error occured when trying to get the story info. System.Exception Couldn't find the interactive description

towerofpower256 commented 5 years ago

Hi left1000,

I'll have a look at it now, and try to export it myself. I'm up to chapter 11111221 (45/703), no issues so far.

Going by your log output, it sounds like;

I really should update the log output for the release version so that it outputs what chapter it was trying to get when it failed.

I'll let you know how I go.
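For anyone following along, the idea of logging which chapter was being fetched when the export failed could look roughly like the sketch below. It is written in Python purely for illustration (the exporter itself is a .NET application), and every name in it (`ChapterParseError`, `export_chapters`, `fetch`, `parse`) is hypothetical rather than taken from the project.

```python
import logging

logger = logging.getLogger("exporter_sketch")


class ChapterParseError(Exception):
    """Raised when an expected element is missing from a chapter's HTML."""

    def __init__(self, what, chapter_url):
        # Put the chapter URL in the message so it shows up in the log output.
        super().__init__(f"Couldn't find {what} for chapter '{chapter_url}'")
        self.what = what
        self.chapter_url = chapter_url


def export_chapters(chapter_urls, fetch, parse):
    """Export chapters in order, logging the failing URL before re-raising.

    `fetch(url)` returns the page HTML and `parse(html, url)` returns the
    parsed chapter; both are hypothetical callables supplied by the caller.
    """
    exported = []
    total = len(chapter_urls)
    for index, url in enumerate(chapter_urls, start=1):
        try:
            exported.append(parse(fetch(url), url))
        except ChapterParseError as err:
            logger.error("Chapter %d/%d failed: %s", index, total, err)
            raise
    return exported
```

With something like this, the log line would name both the position in the export (for example 2/703) and the URL that could not be parsed.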

left1000 commented 5 years ago

It failed instantly; I don't think it ever made it past 0 or 1 chapters grabbed. The UI briefly showed 0/totalchapters before the error popped up. It could be a .NET-related error I guess, I'm on Windows 7. I did download and install the .NET file from your GitHub though, the one you linked to. So perhaps you need another .NET link to refer people to as well?

towerofpower256 commented 5 years ago

I wouldn't believe so. That error says that it was able to pull down the HTML for that chapter, but it couldn't find certain bits in the HTML. In the past it's been because Writing.com has sent back something that isn't the chapter. I've seen some crazy login pages for stories that've been made private, and weird HTML that's been snuck in by the author.

I'll work on adding some additional error handling to help identify why it doesn't like it.

towerofpower256 commented 5 years ago

Hey left1000, can you try running this attached version of the exporter? It's got some additional exception handling, and gives the option to show the HTML that Writing.com sent back. Using that, I can see what is being returned to you, and hopefully find out what's different.

WritingExporter.SimpleExporter v0.4.zip

left1000 commented 5 years ago

--snip--- Was raw HTML that Github didn't format correctly.

left1000 commented 5 years ago

The above is what I get from this url: https://www.writing.com/main/interact/item_id/1911783-Madisons-Freshman-15

also do a once over please... reassure me that gigantic block of text isn't full of my ip address or some such?

left1000 commented 5 years ago

Also I just realized that github is actually processing the html code... so here's a pastebin? https://pastebin.com/7mMkbEhL

towerofpower256 commented 5 years ago

Well this is interesting...

Currently, it uses the following regex to look for the page title: (?<=<big><big><b>).*?(?=<\/b><\/big><\/big>)
For some odd reason, the HTML that you're getting doesn't follow that standard, so the regex doesn't find anything.

Yours is: <span style="cursor:pointer;" title="Created: 01-04-2013">Madison's first night at college &nbsp; </span>

Pulling apart the WDC pages is tricky at best, as almost all of the elements don't have names. I've often thought that this is either really sloppy web design, or it's intentionally difficult to try and make scraping annoying.

Bear with me, I'll see if I can try and use a different regex to get the page's title.
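As a rough illustration of the "different regex" idea, here is a minimal Python sketch that tries the original pattern first and then falls back to the span layout quoted above. The fallback pattern and the function name `find_chapter_title` are assumptions made for this example; they are not the fix that ended up in the exporter, and the real pages may vary in ways this does not cover.

```python
import html
import re

# The pattern quoted above, as used by the exporter at the time.
BIG_TITLE = re.compile(r"(?<=<big><big><b>).*?(?=<\/b><\/big><\/big>)")

# Hypothetical fallback for the <span> layout seen in the pastebin dump.
SPAN_TITLE = re.compile(
    r'<span style="cursor:pointer;" title="Created: [^"]*">(.*?)</span>',
    re.DOTALL,
)


def find_chapter_title(page_html):
    """Return the chapter title, or None if neither layout matches."""
    match = BIG_TITLE.search(page_html)
    if match:
        raw = match.group(0)
    else:
        match = SPAN_TITLE.search(page_html)
        if not match:
            return None
        raw = match.group(1)
    # Decode entities such as &nbsp; and trim the surrounding whitespace.
    return html.unescape(raw).strip()
```

Returning None instead of raising leaves it to the caller to decide whether a missing title is fatal, for example when the response is a login page rather than a chapter.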

left1000 commented 5 years ago

If it matters, I have a paid account, so I never see ads or that splash screen asking me to wait 5 seconds or anything like that. Maybe if I made another free account it'd work? Although the main reason I want this to work is that, in theory, my paid account might work better/faster than a free one. If I have to make a free one to get it to work, that defeats the purpose, and I might as well just download from the dropbox/googledoc/etc.

towerofpower256 commented 5 years ago

Ah, that might be it. I'll see what I can do.

Absolutely, I'd want it to work with paid accounts to export the story much faster.

towerofpower256 commented 5 years ago

Alrighty, give this build a go.

WritingExporter.SimpleExporter v0.4.zip

left1000 commented 5 years ago

Well, you made progress! It got to the second chapter (chapter 11). It then still crashed right away, but crashing at 2 is better than at 0 or 1. https://pastebin.com/2RP64S4E

towerofpower256 commented 5 years ago

What was the error, or which part of the chapter did it fail to grab?

I've been noticing some other HTML changes as I examine this issue; I think WDC is making some changes to their layout.

left1000 commented 5 years ago

In the lower left corner of the program it just said ERROR, then a popup appeared. I clicked yes and put the result into the pastebin (it's very long, with lots of HTML code).

Before it said ERROR it said 2/703 Exporting Chapter 11

towerofpower256 commented 5 years ago

Hi left1000,

Attached is a new build I'm working on. Can you give it a go and let me know if you still encounter the same issue?

It's still under development, but I'd like to know if it's solved this problem for you. Debug.zip

left1000 commented 5 years ago

7:23 AM-DEBUG-WritingExporter.Common.StorySync.StorySyncWorker: Story info does not need updating

7:23 AM-DEBUG-WritingExporter.Common.StorySync.StorySyncWorker: Story chapter outline does not need updating

7:23 AM-DEBUG-WritingExporter.Common.Wdc.WdcClient: Getting interactive story chapter: https://www.writing.com/main/interact/item_id/1911783-Madisons-Freshman-15/map/1

7:23 AM-WARN-WritingExporter.Common.StorySync.StorySyncWorker:

WritingExporter.Common.Exceptions.WritingClientHtmlParseException Couldn't find the chapter title for chapter 'https://www.writing.com/main/interact/item_id/1911783-Madisons-Freshman-15/map/1'

7:23 AM-WARN-WritingExporter.Common.StorySync.StorySyncWorker: Unhandled exception thrown by sync worker. System.IO.PathTooLongException The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters.

7:23 AM-DEBUG-WritingExporter.Common.StorySync.StorySyncWorker: Pausing until next loop

7:23 AM-DEBUG-WritingExporter.Common.WdcStoryContainer: Checking if any stories need saving

7:23 AM-DEBUG-WritingExporter.Common.StorySync.StorySyncWorker: Pausing until next loop

7:23 AM-DEBUG-WritingExporter.Common.StorySync.StorySyncWorker: Working story: 1911783-Madisons-Freshman-15

7:23 AM-DEBUG-WritingExporter.Common.StorySync.StorySyncWorker: Story info does not need updating

7:23 AM-DEBUG-WritingExporter.Common.StorySync.StorySyncWorker: Story chapter outline does not need updating

7:23 AM-DEBUG-WritingExporter.Common.Wdc.WdcClient: Getting interactive story chapter: https://www.writing.com/main/interact/item_id/1911783-Madisons-Freshman-15/map/1

7:23 AM-WARN-WritingExporter.Common.StorySync.StorySyncWorker:

WritingExporter.Common.Exceptions.WritingClientHtmlParseException Couldn't find the chapter title for chapter 'https://www.writing.com/main/interact/item_id/1911783-Madisons-Freshman-15/map/1'

7:23 AM-WARN-WritingExporter.Common.StorySync.StorySyncWorker: Unhandled exception thrown by sync worker. System.IO.PathTooLongException The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters.

7:23 AM-DEBUG-WritingExporter.Common.StorySync.StorySyncWorker: Pausing until next loop

7:23 AM-DEBUG-WritingExporter.Common.WdcStoryContainer: Checking if any stories need saving

7:23 AM-DEBUG-WritingExporter.Common.StorySync.StorySyncWorker: Pausing until next loop

left1000 commented 5 years ago

moved the folder from d:\blahblahblah\blahblahblah to d:\

and now

(moved to attachment) Log sample 2019-07-26.txt

left1000 commented 5 years ago

It went to 4/704, got stuck, and then wouldn't export anything to HTML when I tried, so I didn't even get 4 chapters.

towerofpower256 commented 5 years ago

Alrighty, I've been going over things again and in reading between the lines, it looks like because you're a paid user, you get some sort of dynamically loading pages which work differently to the regular pages, where WDC will try to preemptively load chapters ahead of you.

It looks like there's a button to turn that off somewhere on that page. Out of curiosity, what happens if you turn the dynamic page loading off from within a browser, and then try to export again? Just try exporting with the SimpleExporter release version from GitHub, or the v0.4 attached earlier in this thread.

If that fails after disabling that dynamic page loading, then I fear that paid users may need their own set of reading functions, which could be "fun" :S

left1000 commented 5 years ago

That appears to have 100% solved the problem.

towerofpower256 commented 5 years ago

Eyyyy, that's what we want to hear!

left1000 commented 5 years ago

One more thing, of course: the debug errors. There was one that I was able to figure out myself, that the path name was too long, recall? Well, I've done some testing.

This is a valid folder to run the software from:
D:\Writing.com Exporter.v0.4 for some reason needs to be moved to d root in order to work\
This is not:
D:\myairbridge\W.04\

So the error about the path being too long was both true and untrue. I'm not sure this error needs fixing, but the instructions should mention that this exe file can only be run inside one directory, not two. Total file path length is the error you get, but it's not strictly true.

towerofpower256 commented 5 years ago

Already applied a fix for that one. In the next version, the file dumps will use much shorter names, and have some of the information about the dump inside the file instead of the file's file name.
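A minimal sketch of that approach, in Python for illustration only: the dump file gets a short, fixed-length name, and the details that used to be in the name are written as a header inside the file. The function name `write_html_dump`, the `dump_<hash>.txt` naming scheme and the header layout are all assumptions for this example, not the exporter's actual format.

```python
import hashlib
import tempfile
from pathlib import Path


def write_html_dump(dump_dir, chapter_url, error, page_html):
    """Write a debug dump with a short, fixed-length file name."""
    # A short hash of the URL keeps the name length constant, whatever the URL is.
    digest = hashlib.sha1(chapter_url.encode("utf-8")).hexdigest()[:12]
    path = Path(dump_dir) / f"dump_{digest}.txt"
    # Details that would otherwise bloat the file name go in a header instead.
    header = f"URL: {chapter_url}\nError: {error}\n\n"
    path.write_text(header + page_html, encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dump = write_html_dump(
            tmp,
            "https://www.writing.com/main/interact/item_id/1911783-Madisons-Freshman-15/map/1",
            "Couldn't find the chapter title",
            "<html>...</html>",
        )
        print(dump.name)
```

Because the file name no longer grows with the story URL, the full path length depends only on where the program is installed.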

Cool beans, I'll close off this issue if this particular problem is fixed. I'll look into including a warning within the program that, if you get this error message, it might help to turn off the WDC feature that auto-loads the interactive chapters.