otwstephanie / otwarchive3

Clone of OTWarchive, for merge testing

Upload from URL has some problems #351

Closed otwstephanie closed 10 years ago

otwstephanie commented 10 years ago

From Black0Sa...@gmail.com on September 12, 2008 23:50:34

What archive revision are you testing on? (rev.666 #)

What steps will reproduce the problem?

  1. Click 'Post New'
  2. Click 'Upload from an existing URL?'
  3. Paste in a variety of different URLs and see what happens

Posting from my LJ:

http://black-samvara.livejournal.com/381224.html - didn’t load the content of the post; I got the header info instead
http://black-samvara.livejournal.com/379835.html - loaded beautifully

Some random examples from the SPNnewsletter:

http://se-parsons.livejournal.com/895277.html - “Sorry, but we couldn't read from that URL. :(”
http://x-strangeangels.livejournal.com/42435.html - loaded title as “Adult Content Notice” (not logged in as yourself is probably a bad idea :p)
http://apreludetoanend.livejournal.com/100193.html - loaded title as “apreludetoanend: Triptych [Gen, PG-13]”

Original issue: http://code.google.com/p/otwarchive/issues/detail?id=351

otwstephanie commented 10 years ago

From Black0Sa...@gmail.com on September 12, 2008 21:51:54

To find the works in the archive http://testarchive.transformativeworks.org/en/tags/194

otwstephanie commented 10 years ago

From shal...@gmail.com on September 14, 2008 09:10:58

Will look into the story scanning, although there are likely to just be some limitations on how well it can go. One solution will probably be to just offer the user the option to just load up the entire HTML of the body as a fallback.
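A minimal sketch of that fallback in plain Ruby. It is not the archive's story parser: `extract_story`, the parser lambda, and the `entry-content` class name are all hypothetical, and a real implementation would likely use an HTML parser rather than regular expressions.

```ruby
# Hypothetical sketch, not the archive's actual story parser.
# Each site-specific parser is a callable that returns the story text or nil.
def extract_story(html, parsers)
  parsers.each do |parser|
    text = parser.call(html)
    return { source: :parser, content: text.strip } if text && !text.strip.empty?
  end

  # Fallback: hand back everything inside <body>, or the whole document
  # when no body element can be found, and let the user trim it by hand.
  body = html[%r{<body[^>]*>(.*)</body>}mi, 1]
  { source: :fallback, content: (body || html).strip }
end

# Assumed markup for illustration only; real journal pages differ.
journal_parser = ->(html) { html[%r{<div class="entry-content">(.*?)</div>}m, 1] }

page = "<html><body><p>Once upon a time.</p></body></html>"
result = extract_story(page, [journal_parser])
puts "#{result[:source]}: #{result[:content]}"
```

Returning the `source` alongside the content would let the draft page warn the user when the whole body was loaded and needs manual cleanup.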

Marking this as post-beta because it's not urgent. We should probably mark the upload-from-existing as an experimental feature for beta.

Status: Accepted
Labels: Milestone-PostBeta

otwstephanie commented 10 years ago

From Black0Sa...@gmail.com on September 17, 2008 00:40:01

Some email feedback:

Archive is hanging on a load from URL request to fanfiction.net ( http://www.fanfiction.net/s/277511/1/Mindless_Fun ). It's been a time measurable in minutes with no response from the archive. (Sorry, I really don't know how many minutes, I started doing some other stuff and lost track.)

Upload from URL feature did not fully upload http://web.archive.org/web/20040310174832/http://witchqueen.diary-x.com/journal.cgi?entry=20040108b It choked on a —. That was the em-dash character in ISO-8859-1, not an &mdash; character entity. The archive uploaded all of the text from before that point, but not that symbol or anything which followed it.
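One way to avoid truncating at the first non-UTF-8 byte is to transcode the fetched page before parsing it. The sketch below is an assumption-laden illustration, not the archive's code: `to_utf8` is a hypothetical helper, it uses the `String#encode` API from Ruby 1.9 and later, and it assumes the fallback encoding is Windows-1252. That is the superset browsers commonly apply to pages labeled ISO-8859-1, and it is where the em dash lives as byte 0x97.

```ruby
# Hypothetical helper: return fetched bytes as valid UTF-8.
# Assumption: anything that is not already valid UTF-8 is Windows-1252.
def to_utf8(raw, fallback_encoding = Encoding::Windows_1252)
  utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  # Reinterpret the bytes in the fallback encoding and convert them,
  # replacing anything unmappable instead of raising or truncating.
  raw.dup.force_encoding(fallback_encoding)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: "?")
end

# 0xEF is the double-dotted i and 0x97 is the em dash in Windows-1252.
legacy = "na\xEFve \x97 text".b
puts to_utf8(legacy)
```

If the page declares a charset in its HTTP headers or a meta tag, passing that as `fallback_encoding` would be safer than guessing.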

URL upload failure for url http://remix.illuminatedtext.com/dbfiction.php?fiction_id=441 I got an error message "Sorry, but we couldn't read from that URL. :(", with no attempt to upload.

URL upload failed for http://www.innergeekdom.net/Twice/12-01.htm Got error message "Sorry, but we couldn't read from that URL. :(". You know what would be useful? If that error sent you to the regular upload page after it failed to upload by URL, because you are probably still going to want to upload the same story, but you'll have to do it with cut and paste.

otwstephanie commented 10 years ago

From highlander.ii@gmail.com on September 19, 2008 17:52:43

Additional feedback:

otwstephanie commented 10 years ago

From Black0Sa...@gmail.com on September 27, 2008 18:48:27

Archive revision: (rev.796)

Feedback:

Naomi said in her recent e-mail, "Where would you like to be able to use "upload from url" from, where it's not already working?" I had no idea that was only supposed to be working from specific fan fiction archives, so as a piece of feedback, I should tell you that I have been trying to upload from URLs from my own site, where each story is formatted exactly the same as any other story, with standard headers, footers, CSS, etc. Just so you have a data point, I have succeeded in uploading two of the roughly 20 stories I have tried to upload from that site. One of the stories did a pretty good job auto-populating the metadata fields, and one of them didn't. I don't know if this is useful information for you but I figured I would provide it anyway.

The URL is http://suberic.net/~jadelennox/

otwstephanie commented 10 years ago

From Black0Sa...@gmail.com on October 10, 2008 17:08:44

Feedback: # 116

Upload from URL has been working really well for stories from my site, but for some reason http://rivkat.com/spn/three.html only uploaded as far as the first part of the first word-- It'

Thank you!

Sent at: 09:35AM EDT Sun 05 October 2008

Feedback # 124

Upload from URL did a funky thing to this story, http://rivkat.com/smallville/angel.html : The upload stopped at the first nonstandard character, the double-dotted i in naïve.

Summary: Upload form URL has some problems

otwstephanie commented 10 years ago

From Black0Sa...@gmail.com on November 01, 2008 18:03:37

More feedback:

I used the 'post from another url' thing to post from the Yuletide archive; it did not work for the title, summary etc; it worked for the main story, except for putting three line breaks between each paragraph, like this:

I am not sure how that would've come out if I'd just posted it like that, as I took them out before I previewed it. But since I don't actually have a copy of the story myself anymore, that is a pretty good feature!
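A minimal sketch of a cleanup pass for the extra line breaks, assuming they arrive as runs of `<br>` tags or blank lines (the report does not say which). `collapse_breaks` is a hypothetical helper, not something in the archive.

```ruby
# Hypothetical cleanup step: collapse runs of <br> tags into a single
# break and runs of three or more newlines into one blank line.
def collapse_breaks(html)
  html
    .gsub(%r{(?:\s*<br\s*/?>\s*){2,}}i, "<br />\n")
    .gsub(/\n{3,}/, "\n\n")
end

puts collapse_breaks("<p>One</p><br><br /><BR/><p>Two</p>")
```

A single `<br>` is left alone, so deliberate line breaks inside poetry or letters would survive.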

otwstephanie commented 10 years ago

From Black0Sa...@gmail.com on November 02, 2008 21:51:07

Merged in from Issue 460 .

Comment 1 by highlander.ii, Sep 20, 2008

In addition to this, could we have something in the code that will strip out random LJ/IJ/JF specific pieces that are not related to the fic? Items like 'mood theme icon', 'entry tags', etc.

Summary: Upload ftom URL has some problems

otwstephanie commented 10 years ago

From Black0Sa...@gmail.com on November 02, 2008 21:58:36

Merged in from Issue 461

URL upload also failed on the following (including for diagnostic purposes):

http://www.innergeekdom.net/Twice/12-01
http://remix.illuminatedtext.com/dbfiction.php?fiction_id=441
http://web.archive.org/web/20040310174832/http://witchqueen.diary-x.com/journal.cgi?entry=20040108b (didn't fully upload - hung on em-dash character in ISO-8859-1)

otwstephanie commented 10 years ago

From Black0Sa...@gmail.com on January 06, 2009 23:44:17

Merged in from http://code.google.com/p/otwarchive-feedback/issues/detail?id=308 Here's some user feedback on the archive!

Archive revision: (rev.953)

Feedback:

A bit of weirdness with the "upload from existing URL" business. Lately it's been posting the text of the story TWICE.

For example, I fed it this link: http://home.teleport.com/~punkm/lipstick.html And I get the entire story, twice, in the story field.

Sent at: 06:16PM EST Mon 05 January 2009

otwstephanie commented 10 years ago

From eni...@gmail.com on February 19, 2009 03:52:26

Summary: Upload from URL has some problems
Labels: Component-Application

otwstephanie commented 10 years ago

From eni...@gmail.com on March 06, 2009 14:03:03

Labels: Milestone-Internal0.8

otwstephanie commented 10 years ago

From nleon...@gmail.com on September 14, 2009 11:05:18

Collected url, the title the story should have, a sentence copied & pasted from the end of the story, the fandom, and the rating, with the exception of the first two links ( http://black-samvara.livejournal.com/381224.html and http://black-samvara.livejournal.com/379835.html ) and duplicate websites.

Attachment: Upload from URL test data.txt

otwstephanie commented 10 years ago

From shal...@gmail.com on September 18, 2009 23:09:13

Fixed in r1466 . All of the urls described here (that are still valid) should now work at least to the extent of being loaded into the storyparser and producing a valid draft.

Status: FixedAndCommited
Owner: shalott

otwstephanie commented 10 years ago

From amelia4OTW@gmail.com on September 28, 2009 13:51:27

as of r1537

Status: FixedButUnverified

otwstephanie commented 10 years ago

From amberlee...@yahoo.com on October 02, 2009 16:29:09

Working on Test revision 1537 from a laptop PC running Windows XP service pack 3 and using firefox v.3.5.2

I am unsure if this is helpful or not given the context of the above (and that I am attempting different imports). However I have replicated situations by using the same or similar archives (FFN, Insane Journal, M/A archive, archive run by efiction, etc). Here we go.

Attempted the following imports:

1) From http://www.fanfiction.net/secure/story/story_preview.php?storyid=1540312 Produces no import. Gives archive response of: "We're sorry, but something went wrong. We've been notified about this issue and we'll take a look at it shortly." It should also be noted that I tried to import from FFN before (previous test version with two different stories of a single chapter) and got this exact error. I was working on some other things at the time and noted that this one was open so I left the issue alone.

2) From http://hlfiction.net/viewstory.php?sid=916 (Highlander Fan Fiction Archive running on "efiction" base) Story imports but contains header, link, and other information from the archive. Also retains archive footer. Import of title also included author information "by XXX". Tried to post test story with additional imported header/footer information intact for inspection if necessary but did not have option to post. Draft is here: http://testarchive.transformativeworks.org/works/6295

3) http://yaoiville.net/viewstory.php?sid=68&i=1 Story imports but contains header, link, footer and other information from the host archive. Import of title did not occur and, instead, the name of the archive appeared in the story title area. Draft gave option to post unlike the previous work. Also, text of work had double dashes and emdashes in the story. These imported as crazy symbols instead of punctuation. Posted story (with all import information) is here: http://testarchive.transformativeworks.org/works/6296

4) http://www.masterapprentice.org/archive/a/always_1.html Title imported correctly. Summary and Notes began import but stopped after a limited number of words/characters. Some archive header information then imported into the story. Story itself did not fully import, but stopped at a seemingly random location in the file (after 2181 words). Draft gave option to post, so I posted the story as was so the import could be seen. http://testarchive.transformativeworks.org/works/6297

5) http://amberleewriter.insanejournal.com/56099.html (and all nine additional chapters each in individual insanejournal posts) Attempt to paste ten individual URLs as chapters into the import window returned the post new work page with the following error in a pink box:

We couldn't save this work, sorry! Here are the problems we found:
* Title can't be blank
* Title must be at least 1 characters long.

The entire post new work page was blank. After a little thought, I wondered if all of the chapters were public entries. I also wondered if this was due to multiple URLs. I tried again below ---

6) http://amberleewriter.insanejournal.com/56099.html (this time as the only url and checked to make sure that this was a public entry) Import worked. Title imported correctly. Summary imported correctly (and complete). Notes also imported correctly. All of tags and other info in header of post imported (but I would expect this with a journal entry so no biggie). While some line breaks (spacing with br or p tags) did not import, all italic, blockquotes, and bold text imported without a problem. Dashes and emdashes imported correctly and show up instead of being strange symbols as with the other archive.

Am setting this to "assigned" as test doesn't show that FFNet or livejournal based code (previous import locations) are necessarily working properly for me. If coders feel exact replication is required and would rather work only with previous tester/feel this is a resolution, please set it back and I'll bug off! --Amberlee

Status: Assigned

otwstephanie commented 10 years ago

From autu...@gmail.com on November 23, 2009 18:51:32

I feel bad appending to an already monster sized issue. This is a feedback that was brought from the yuletide chatroom to 16bugs on Nov 14 (before 7.1.1). User having issues trying to import from ficwad. User says: "It imports the entire page, including all the links. It doesn’t grab any of the html tagging in the fic."

This isn't much help since we don't have a URL, but just documenting. 16Bugs #258

otwstephanie commented 10 years ago

From awil...@gmail.com on January 30, 2010 00:30:45

I too feel bad, but one user cannot import from WordPress without all of her navigation going into the story text. The URL is http://www.branchandroot.net/archive/2004/04/security/

otwstephanie commented 10 years ago

From mooncros...@gmail.com on January 30, 2010 04:18:02

Just dropping my 2c in: I'm also uploading from WordPress, and the same thing as in comment #18 happens to me too (has been happening as far back as I can remember...). But I don't see that as a real issue. To be honest, I don't expect the archive to know exactly which part of the WordPress page is the actual story, and which part isn't. Especially not since there are so many WP themes out there....

I have to go through the uploaded file anyway to add all the metadata (tags, characters, etc), and it's easy to cut the superfluous parts of the navigation from the story text. I do love how it understands which part is the Author notes, though!

(sample url if someone wants to see what happens: http://jericho.scribblesinink.com/times-like-these/ )

otwstephanie commented 10 years ago

From eni...@gmail.com on February 28, 2010 09:29:10

Labels: -component-Application Component-BackEnd

otwstephanie commented 10 years ago

From anzu.kaiba on March 06, 2010 23:05:23

I imported several works from FFnet, but I got this:

Failed Imports
http://www.fanfiction.net/s/175218/1/Only_16
Please enter your story in the text field below.

The other stories appear to have successfully imported, but there is no text field "below" the error message. Attached is screencap of the error.

Attachment: Google Chrome004.png

otwstephanie commented 10 years ago

From anzu.kaiba on March 12, 2010 19:53:43

I've also tried on the regular archive (not the testarchive), and I got this:

http://www.fanfiction.net/s/175218/1/Only_16
/usr/lib/ruby/1.8/timeout.rb:60:in `timeout': execution expired
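"execution expired" is the default message of Ruby's `Timeout::Error`, so the fetch ran past its time limit and the exception reached the user unhandled. A minimal sketch of catching it and reporting a usable failure instead; `fetch_with_limit` and the result hash are hypothetical, and the block stands in for whatever actually downloads the page.

```ruby
require "timeout"

# Hypothetical wrapper: run a fetch under a time limit and report a
# failure result instead of letting Timeout::Error reach the user.
def fetch_with_limit(url, seconds)
  content = Timeout.timeout(seconds) { yield url }
  { url: url, ok: true, content: content }
rescue Timeout::Error
  { url: url, ok: false,
    error: "Timed out after #{seconds}s; please paste the story text instead." }
end

# Stand-in for a slow remote site; no network access is needed here.
result = fetch_with_limit("http://example.invalid/slow", 0.1) do |_url|
  sleep 1
  "never returned"
end
puts result[:error]
```

With `Net::HTTP`, setting `open_timeout` and `read_timeout` on the connection is usually preferable to wrapping the whole request in `Timeout.timeout`, but the rescue-and-report shape stays the same.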

otwstephanie commented 10 years ago

From eni...@gmail.com on April 19, 2010 08:31:32

Labels: -Roadmap-Work Roadmap-WorkImporting

otwstephanie commented 10 years ago

From autu...@gmail.com on November 14, 2010 11:54:39

Bulk adding NeedsTest labels to all open issues

Labels: NeedsTests

otwstephanie commented 10 years ago

From hele...@gmail.com on October 31, 2011 21:39:50

Problems with importing from insanejournal: https://otw.16bugs.com/projects/4911/bugs/206851 (Attempted import imports comments into the body of the story, and not the entry. User says it imports text outside of the cut; I was able to reproduce the comments part, but I can't make it import anything else.)

otwstephanie commented 10 years ago

From jenn.cal...@googlemail.com on May 26, 2012 15:42:01

All links moved to the wiki page: http://wiki.transformativeworks.org/mediawiki/Import_Issues and issue closed as per Op Cobra; new links etc. should be added to the page instead.

Status: Invalid