KurzonDax / nZEDbetter

An improved usenet indexer
nzedbetter.org
GNU General Public License v3.0

Manual Part Repair #70

Open akellymusic opened 10 years ago

akellymusic commented 10 years ago

I am on Ubuntu 13.10 and I would like to know how to manually run part repair after setting the option in the backend. Can someone please help with this?

Thanks

KurzonDax commented 10 years ago

Hey, sorry for the delay in getting back to you.

Running the part repair manually is pretty easy. Just open a new terminal window, or start a new SSH connection to your server, and enter the following at the command prompt:

cd /var/www/nZEDbetter/misc/update_scripts/threaded_scripts
./partrepair_manual.py

This will start the part repair process using the number of threads specified in the Site Settings page for Update Binaries Threads. At some point in the future, I'll create a separate setting for the number of part repair threads, but that has been low on the priority list. Just remember that each part repair thread requires an NNTP connection.
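To illustrate why the thread count maps one-to-one to NNTP connections, a part-repair worker pool might look roughly like this. This is a minimal sketch with a stubbed NNTP client, not the actual internals of partrepair_manual.py:

```python
from concurrent.futures import ThreadPoolExecutor

class StubNNTPConnection:
    """Stand-in for a real NNTP connection; one is opened per repair task."""
    def fetch_article(self, message_id):
        return f"article:{message_id}"

def repair_part(message_id):
    # Each worker opens its own connection, so N threads => N NNTP connections.
    conn = StubNNTPConnection()
    return conn.fetch_article(message_id)

def run_part_repair(missing_ids, threads):
    # 'threads' plays the role of the Update Binaries Threads setting.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(repair_part, missing_ids))

results = run_part_repair(["<a@example>", "<b@example>"], threads=2)
```

If your provider caps connections (say, at 20), remember that part repair threads count against the same budget as your update and backfill threads.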

It should be noted that with the seemingly large number of DMCA take-down notices that Usenet providers have been receiving from the MPAA and RIAA thieves, don't expect miracles from part repair. To be honest, I don't even use part repair on my own server any more.

akellymusic commented 10 years ago

Thank you for the reply.

If part repair is sort of useless, then how do I get my parts table to decrease without having to truncate? It's at 33 million+. I thought part repair helped with that?


KurzonDax commented 10 years ago

Part repair has seemed useless to me, but you can definitely give it a shot and see how well it works for you.

The number of entries in the parts table can vary pretty wildly, and depends a lot on the number of groups you're indexing. One thing to check: in the Admin section of the website, go to Script Settings, click the Postprocessing tab, and scroll down to the Purge Processed Collections section.

Obviously, make sure that Purge Processed Collections is set to true. The other setting to check is the Stale Collection Window in Hours. Generally, you can set this anywhere from 4 to 24; I usually set it to 6. Basically, once an hour the purge process scans the collections table for collections that haven't had an update within that window (6 hours, in my case), measured against either the group's oldest or newest article. For each stale collection, it looks at the total number of binaries and parts and estimates how complete the collection is. If that's greater than your minimum release completeness in Site Settings, it queues the collection to be converted to a release. If it doesn't have enough binaries or parts, it purges the collection and its related binaries/parts.
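The purge decision described above can be sketched roughly like this. The names and default thresholds are illustrative, not the actual nZEDbetter code:

```python
def purge_decision(parts_received, parts_expected, last_update_age_hours,
                   stale_window_hours=6, min_completeness=95.0):
    """Decide what the hourly purge pass would do with one collection."""
    if last_update_age_hours < stale_window_hours:
        return "keep"            # still receiving parts; leave it alone
    completeness = 100.0 * parts_received / max(parts_expected, 1)
    if completeness >= min_completeness:
        return "queue_release"   # complete enough: convert to a release
    return "purge"               # stale and incomplete: drop it and its parts

purge_decision(980, 1000, last_update_age_hours=8)   # 98% complete -> "queue_release"
purge_decision(100, 1000, last_update_age_hours=8)   # 10% complete -> "purge"
```

The key takeaway is that parts only leave the table once their collection either becomes a release or goes stale and gets purged, so a short stale window drains the table faster at the cost of purging slow-filling collections.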

However, if you're indexing a large number of groups and trying to backfill them simultaneously, a parts table of 33 million isn't that uncommon. On my test server, I index about 240 groups. If I'm running backfill as well on those groups, I've seen the parts table climb to 45+ million. Even though I have 32GB of RAM in it, that's still a lot for it to handle and the time it takes to insert new parts starts climbing drastically.

There are a couple of things you can do to help:

  1. If the purge backlog gets really high (above 3000 or 4000), shut off backfill temporarily until the purge thread gets caught back up.
  2. Reduce the number of backfill threads. I generally run about 10 update binaries threads and 10 backfill threads. It could be that you're throwing more parts/binaries/collections at the database than it can realistically handle in a timely fashion.
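Step 1 above amounts to a simple backpressure rule. Sketched as a helper with hysteresis so backfill doesn't flap on and off near the threshold (the thresholds are the ones mentioned above; the helper itself is hypothetical, not part of nZEDbetter):

```python
def should_pause_backfill(purge_backlog, high_water=4000, low_water=3000,
                          paused=False):
    """Pause backfill above high_water; resume only once below low_water."""
    if paused:
        return purge_backlog >= low_water   # stay paused until caught up
    return purge_backlog > high_water       # otherwise pause only when high

should_pause_backfill(4500)                 # -> True, backlog too high
should_pause_backfill(3500, paused=True)    # -> True, still catching up
```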

The choke point becomes release creation (specifically, creating the NZBs) and then purging the collections afterwards. What your system can handle is a function of how much RAM it has, the number and speed of its processor cores, and its hard drive throughput.