GetiPlayerAutomator / get-iplayer-automator

Moved to https://github.com/Ascoware/get-iplayer-automator! The goal of Get iPlayer Automator is to allow iTunes and your Mac to become the hub for your British Television experience regardless of where in the world you are. Currently, Get iPlayer Automator allows you to download and watch BBC and ITV shows on your Mac. Series-Link/PVR functionality ensures you will never miss your favourite shows. Programmes are fully tagged and added to iTunes automatically upon completion. It is simple and easy to use, and runs on any machine running Mac OS X 10.7 or later. And since the shows are in iTunes, it is extremely easy to transfer them to your iPod, iPhone, or Apple TV allowing you to enjoy your shows on the go or on your television.
https://github.com/Ascoware/get-iplayer-automator
GNU General Public License v3.0

Programmes with accented characters may corrupt download history #191

Closed willson556 closed 10 years ago

willson556 commented 10 years ago

From dar...@ingram.fi on April 11, 2013 10:21:57

What steps will reproduce the problem? When upgrading between various -pre packages in the past, including today from pre8 to pre10 (or pre8 to pre9) on fully upgraded OS X 10.8.3, the app crashes on start ("not responding") before it even opens the GUI.

It appears from experimentation to be possibly related to the download history. Copy back download history and the app will not load. Delete and let it rebuild and it will load. Then copy back and it is back to not loading.

Other files seem not affected, such as copying back Queue.automatorqueue which returns the PVR queue.

It would be rather nice not to have to trash the download history all of the time! Would it be possible to have the app first "open" and then show each file loading, so that if it gets stuck on a corrupt or unhappy file, it shows this more clearly to the user (even with an option to isolate/delete the troublesome file, for those who don't mind wandering through /Application Support)?

I can provide a copy of the download history file if this helps, but have not done so yet as it was unclear whether there was any "private information" embedded. A quick skim in textpad didn't reveal anything. File size is c. 950 MB.

Original issue: http://code.google.com/p/get-iplayer-automator/issues/detail?id=195

willson556 commented 10 years ago

From dinkypumpkin on April 15, 2013 12:23:52

A download_history file of 950MB equates to hundreds of thousands of downloads. Have you been that prolific? I'm assuming from your report that you wiped the download history file when updating to pre.8, so it seems unlikely you downloaded that much in a month. The history file must be chock full of junk that is choking get_iplayer and thus causing GiA to seize up. The history file is loaded inside get_iplayer when it starts up, so the process is outside GiA's control.

You may have been hit by an extreme version of the problem reported in issue #180. If you can, look for big blocks of repeated characters and see if they correspond to entries with accented characters in the programme title, or some other pattern. If you find anything like that, try to cut a few example entries into another file and post it so I can have a look. I couldn't reproduce the problem in issue #180, and I still can't, either with GiA or get_iplayer alone. It would seem there may be something screwy with get_iplayer that puts junk into the download history under some circumstances, but the problem needs to be isolated, so post any clues you may find.
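For anyone who would rather not eyeball a file of that size, the search for "big blocks of repeated characters" could be sketched roughly as below. This is only an illustration: the function name, the 2000-character length limit, and the guess that the junk consists of long runs of characters in the U+0080 to U+00FF range are all assumptions, not anything get_iplayer or GiA provides.

```python
import re

# Assumption: the junk shows up as long runs of characters in the
# U+0080-U+00FF range. A genuine accented title contains such
# characters singly, not eight or more in a row.
SUSPECT_RUN = re.compile(r"[\u0080-\u00ff]{8,}")


def find_suspect_lines(path, max_len=2000):
    """Yield (line_number, length, preview) for lines that look corrupted.

    A line is flagged if it is longer than max_len characters or
    contains a long run of high Latin-1 characters.
    """
    with open(path, encoding="utf-8", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            line = line.rstrip("\n")
            if len(line) > max_len or SUSPECT_RUN.search(line):
                yield number, len(line), line[:80]
```

Calling `list(find_suspect_lines(path_to_history))` gives line numbers and short previews, which should make it easier to cut a few example entries into another file for posting.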

For now, see if you can back up your history file and then edit it down to the past month or two of valid entries. It might be unwieldy to do in an editor, but you can try the tail utility to extract however many lines you need from the end of the file. If that works, it should save some work in repopulating the history via "Permanently Skip..." for series-link entries already downloaded.
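As an alternative to running tail by hand, the back-up-then-trim step might look something like this minimal sketch. The function name, the `.bak` suffix, and the default of 500 lines are made up for illustration, and the location of the history file is deliberately left as a parameter because it is not spelled out here.

```python
import shutil
from collections import deque
from pathlib import Path


def keep_last_lines(history_path, keep=500):
    """Back up the history file, then keep only its last `keep` lines.

    Works on raw bytes so that badly encoded entries are copied
    unchanged rather than re-encoded. Returns the backup path.
    """
    history = Path(history_path)
    backup = history.with_name(history.name + ".bak")  # hypothetical backup name
    shutil.copy2(history, backup)

    # deque with maxlen keeps only the final `keep` lines in memory.
    with open(backup, "rb") as src:
        tail = deque(src, maxlen=keep)

    with open(history, "wb") as dst:
        dst.writelines(tail)
    return backup
```

Quit GiA before trying anything like this, and check the trimmed file for junk entries afterwards, since the last few hundred lines are not guaranteed to be clean.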

More generally, there is no need to keep items in the download history if they are no longer available for download or no longer part of your PVR list, so it's a good idea to give it a haircut every month if you're a heavy downloader. Unfortunately, get_iplayer doesn't provide a mechanism to trim the download history, so you'll have to do it in the GiA history editor (or edit the file directly).

willson556 commented 10 years ago

From dar...@ingram.fi on April 16, 2013 03:16:51

Hi. Thank you for the response.

Looking through the file did, as you note, reveal a LOT of strange texts! I have not been mirroring the entire BBC output:)

b01n3fkf|Servants|The True Story of Life Below Stairs: 1. Knowing Your Place|tv|1348957269|flashstd1|/Volumes/Scratch Drive/Servants/Servants.s00e01.The True Story of Life Below Stairs - 1. Knowing Your Place.mp4
b011y90v|Seonaidh (Shaun the Sheep)|10. ÃÃÂÃÂ
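Going only by the Servants entry above, each history record appears to be a single line of pipe-separated fields. A rough parser under that assumption is sketched below; the field names are guesses inferred from that one sample (get_iplayer may well write more fields than the seven visible here), and `parse_entry` is a hypothetical helper, not part of get_iplayer.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Field names are guesses based on the one complete sample entry.
HistoryEntry = namedtuple(
    "HistoryEntry", "pid name episode kind timestamp mode filename"
)


def parse_entry(line):
    """Parse one history line; return None if it looks truncated or corrupt."""
    parts = line.rstrip("\n").split("|")
    if len(parts) < 7:
        return None  # e.g. an entry cut off in the middle of the title
    pid, name, episode, kind, stamp, mode, filename = parts[:7]
    if not stamp.isdigit():
        return None  # the fifth field is assumed to be a Unix timestamp
    return HistoryEntry(pid, name, episode, kind, int(stamp), mode, filename)


def downloaded_at(entry):
    """Interpret the timestamp field as seconds since the epoch (UTC)."""
    return datetime.fromtimestamp(entry.timestamp, tz=timezone.utc)
```

Lines for which `parse_entry` returns None are the ones worth inspecting by hand, as with the truncated Seonaidh entry.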

The reason I've kept the download history is that BBC radio, at least, tends to leave archives up longer than TV, so if I am pulling a regular series down, it avoids duplications. Would it be a big architecture change to have history files for each download source (BBC radio, podcast, ITV etc.), as that might make tracking errors easier?

Now from memory, the Shaun the Sheep entry referred to above might have displayed some Gaelic characters in the title, but there's nothing shown in the current cache. I vaguely remember they had a Gaelic version, so it was something Gaelic (Shaun the Sheep).

willson556 commented 10 years ago

From dinkypumpkin on April 16, 2013 03:49:45

Yes, that's the Gaelic version of Shaun the Sheep. It looks like you have the same problem as issue #180. For now, all I can recommend is to keep an eye out for programmes with accented characters and trim your history as needed. It will be a while before I can devote any time to this problem, so I'll leave this issue open to track it.

While it's true that some radio programmes have long-lived archives, those archives are generally not available via the original iPlayer service, and thus not available via get_iplayer. The newer iPlayer Radio site does provide access to those archives, but that's irrelevant to get_iplayer. With the exception of a few TV and radio series posted on a "series catch-up" basis, most TV and radio programmes are only available for 7 days, so you can be fairly aggressive about culling your download history. If you can no longer locate a previously-downloaded episode in the current search results (i.e., it's no longer in the programme data cache), you can delete it from the history.
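The rule of thumb above (drop anything no longer in the programme data cache) could be expressed as a simple filter, sketched here. It assumes the PID is the first pipe-separated field of each history line, and that you already have a set of PIDs that are still available; how to extract those from the cache is not shown, and `cull_history` is a hypothetical helper rather than an existing get_iplayer or GiA feature.

```python
def cull_history(lines, available_pids):
    """Split history lines into (kept, dropped).

    A line is kept only if its PID (assumed to be the first
    pipe-separated field) is in available_pids.
    """
    kept, dropped = [], []
    for line in lines:
        pid = line.split("|", 1)[0]
        if pid in available_pids:
            kept.append(line)
        else:
            dropped.append(line)
    return kept, dropped
```

Returning the dropped lines as well makes it possible to review them, or save them elsewhere, before anything is overwritten.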

Summary: Programmes with accented characters may corrupt download history (was: Possible upgrade corruption/download history)

willson556 commented 10 years ago

From dinkypumpkin on April 16, 2013 03:50:19

Issue 180 has been merged into this issue.

willson556 commented 10 years ago

From dar...@ingram.fi on April 16, 2013 03:53:40

OK. I don't know about the internals (I only use the GIA app) but for example "Early Music Show" shows a fair few titles but it seems they've also trimmed it down from what it used to be. I shall keep my eyes open !

Would it be worth considering an automatic history cull in the future within the application? Anyway thank you for your endeavours to this important application!

willson556 commented 10 years ago

From dinkypumpkin on April 17, 2013 04:34:47

History culling would never be automatic, but get_iplayer needs some way to cull on demand. If/when that happens, GiA could utilise it.

willson556 commented 10 years ago

From dinkypumpkin on October 25, 2013 15:48:57

Issue 298 has been merged into this issue.

willson556 commented 10 years ago

From dinkypumpkin on November 01, 2013 11:02:03

Deleted by Google: Comment #8 on issue 195 by tj.tuggey

With regard to point #3, BBC4 programmes are often repeated months or years later, BBC2 programmes may get a later outing on BBC4 or vice versa, radio 4 programmes are resurrected for radio 4 extra - often every 12 months or so - and so on, so it is worth keeping documentaries and radio dramas in the history. Having said that, I think my very long delays in GIA opening (it takes over half an hour from the double click) and the sudden closures I've been experiencing may be linked to a large download history and I would sacrifice it by deleting it, but I can't find it. It isn't in Application History and spotlight searching doesn't reveal it. Scrolling too quickly through the download history using Edit Download History to find the Pathe programme and others with accented characters leads to a spinning wheel. I have been able to delete items, but the save takes some time and it can crash, so one by one deleting could take some time. Any idea of where the download history might be lurking? (MacBook Pro OSX 10.8.5, GIA 1.5.6)

willson556 commented 10 years ago

From dinkypumpkin on November 01, 2013 11:14:16

See wiki: CleanInstall

willson556 commented 10 years ago

From dinkypumpkin on November 03, 2013 11:32:24

A partial fix for this has been posted in 1.5.7-pre.1. Due to the bug reported in issue #298, GiA was in effect doubling the number of incorrectly-encoded UTF-8 characters every time the download history was saved. Thus any garbage section would grow much larger than its original size. get_iplayer itself needs some more work to prevent incorrectly-encoded characters going into the history file in the first place, so I'll leave this issue open.
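To give a feel for how a doubling of this kind can arise, here is a generic illustration of repeated mis-decoding. It is not GiA's actual code, and treating the wrong encoding as Latin-1 is an assumption: the text is written as UTF-8 and then read back as Latin-1, so every non-ASCII character becomes two characters on each round trip.

```python
def resave_with_wrong_encoding(text, rounds):
    """Simulate saving as UTF-8 and reloading as Latin-1, `rounds` times.

    Each non-ASCII character in the U+0080-U+00FF range turns into two
    such characters per round, so the garbage doubles on every save.
    """
    for _ in range(rounds):
        text = text.encode("utf-8").decode("latin-1")
    return text


if __name__ == "__main__":
    for rounds in range(6):
        garbled = resave_with_wrong_encoding("\u00e9", rounds)
        print(rounds, len(garbled))
```

A single accented character grows to 2, 4, 8, 16 and then 32 characters over five saves, while plain ASCII text is left untouched. Growth of that kind would be one way for a handful of entries to account for most of a very large history file.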

willson556 commented 10 years ago

From dinkypumpkin on February 02, 2014 08:17:37

Main problem should have been fixed in 1.5.7. A final minor UTF-8 issue in get_iplayer was fixed in 1.6.2-pre.2. See wiki: PreReleaseBuilds

Status: Fixed