ccloli / E-Hentai-Downloader

Download E-Hentai archive as zip file
GNU General Public License v3.0

Please update GP calculator #250

Open Triramama opened 10 months ago

Triramama commented 10 months ago

Hello, pretty recently an update rolled out on sadpanda decreasing GP cost:

Archive download costs were reduced from 20 GP/MB for donators and 30 GP/MB for non-donators, to 15 GP/MB for donators and 20 GP/MB for non-donators. (The costs are calculated by MB, or one million bytes, not MiB.)

Can you make changes in the script please?

Also, I have an issue that may or may not be related to this: calculated gp cost value is way too low for some reason and thus false (image). Any thoughts on that?

ccloli commented 10 months ago

Hello, pretty recently an update rolled out on sadpanda decreasing GP cost

I did notice that update, but haven't checked if it also applies to original image download.

calculated gp cost value is way too low for some reason and thus false

That gallery is counted against image limits; it doesn't cost you GP. Though it's weird that the cost is so small. I guess there's something wrong with the file size? I remember the file size used to be in MB, but maybe it's now MiB?

ccloli commented 10 months ago

After spending ~10k credits (the GP cost doesn't update in real time; it reserves 1k GP at a time for future costs), I'm sure the cost of downloading original images has not changed.

I downloaded an image whose file size is 19,936,461 bytes, which rounds to 20,000,000 = 20 MB. 2k GP let me download it 5 times, whereas at 15 GP/MB (each download costing 300 GP) it should have let me download 6 times before reserving the next 1k GP. So for now the cost is still 20 GP/MB (or maybe 21 GP as I previously tested; someone on the forum said it's 21, but I'm not sure whether he was just referencing my data).
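The comparison between the two rates can be checked with a few lines of arithmetic. This is only a sketch: `downloadsPerBudget` is a made-up helper, and rounding the file size up to whole MB is an assumption chosen to match the rounding used in the test above.

```javascript
// Hypothetical helper: how many downloads of one file fit into a GP budget
// at a given rate. 1 MB = 1,000,000 bytes; rounding up to whole MB is an
// assumption, not confirmed site behaviour.
function downloadsPerBudget(sizeBytes, gpPerMB, budgetGP) {
  const sizeMB = Math.ceil(sizeBytes / 1e6);
  const costPerDownload = sizeMB * gpPerMB;
  return Math.floor(budgetGP / costPerDownload);
}

console.log(downloadsPerBudget(19936461, 20, 2000)); // 5 (400 GP each)
console.log(downloadsPerBudget(19936461, 15, 2000)); // 6 (300 GP each)
```

Five downloads per 2k GP matches the 20 GP/MB rate; six would have been expected at 15 GP/MB.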

As the old updates said:

The GP cost is the same as what it would be for resetting the image limit, and also the same as the archiver cost for including it in an archive for a donator.

At that time the cost was exactly that of downloading and then resetting image limits: each MB takes 10 limits and a reset takes 2 GP/limit, so 20 GP/MB. That was the same as downloading with H@H, and the same as downloading with the official archiver if you were a donator. So at that time, if you were not a donator, downloading original images with the script saved GP compared with the official archiver.

The new change aligns the cost of the archiver with H@H, bringing both down to the same 20 GP/MB. So unless:

... using the script to download old galleries is not a good idea. It now costs the same as the official archiver, plus the image limits needed to get the image links, and may lead to a ban if you download really fast.

Normal cases:

image

image


v1.34.9 has been released to fix the incorrect estimated cost.

Triramama commented 9 months ago

Thanks for the update, you're very quick to respond! I gotta wonder, though, what is the advantage in using archive downloader over "viewing the original image"? It costs twice as much GP, so I don't see why I would ever bother with it.

ccloli commented 9 months ago

I gotta wonder, though, what is the advantage in using archive downloader over "viewing the original image"?

Before the changes, if you were not a donator, using the official archiver was more expensive than original image links. Though I remember Tenboro saying the limit is only meant for scrapers/leechers, as a barrier to stop them from scraping the site without any limit (for non-donators the limit is counted by IP, and nowadays switching IPs is way easier than before). So even though it was cheaper than the official archiver, the goal of limiting them was achieved.

For now, the official archiver is the same price as the original image links, so unless you hit the situations mentioned, using the script or any other mirror tool is pointless. The official archiver should let you download with up to 4 simultaneous streams, and the link can be re-downloaded at no extra cost for 7 days.

For more information, see EHWiki.

It costs twice as much GP, so I don't see why I would ever bother with it.

That's just a special case. For normal galleries, the cost should be the same as or a bit cheaper than using the script, like the last 2 pictures I posted. Some galleries may cost double because they need to be repacked.

Recreated archives (those that have not been downloaded by anyone for 30+ days and therefore require repacking by the server) have an x2 multiplier on their cost.
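Putting the archiver rates from the top of the thread together with the x2 repack multiplier gives a rough estimator. This is a sketch only: `estimateArchiveCost` and its option names are invented for illustration, and rounding up to a whole GP is an assumption, since the thread does not say how the site rounds.

```javascript
// Rates quoted in this thread (GP per MB, 1 MB = 1,000,000 bytes).
const ARCHIVE_RATE = { donator: 15, nonDonator: 20 };

// Hypothetical estimator. Rounding up to a whole GP is an assumption.
function estimateArchiveCost(sizeBytes, { donator = false, recreated = false } = {}) {
  const rate = donator ? ARCHIVE_RATE.donator : ARCHIVE_RATE.nonDonator;
  const multiplier = recreated ? 2 : 1; // x2 when the archive must be repacked
  return Math.ceil((sizeBytes / 1e6) * rate * multiplier);
}

console.log(estimateArchiveCost(100e6));                      // 2000
console.log(estimateArchiveCost(100e6, { recreated: true })); // 4000
console.log(estimateArchiveCost(100e6, { donator: true }));   // 1500
```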

FooIbar commented 9 months ago

So for now the cost is still 20 GP/MB (or maybe 21 GP as I previously tested; someone on the forum said it's 21, but I'm not sure whether he was just referencing my data).

The wiki was recently updated to clarify the download cost.

If manual downloading is not available for free, the system will first attempt to deduct your Full Image Quota (FIQ). The cost is 20 FIQ per MB

  • If your FIQ balance is not enough to cover the cost, you will be charged 1,000 GP in exchange for 1,000 FIQ
  • Your FIQ will only last temporarily. It will be reset when the servers' memory cache is restarted, or the cache entry is pruned due to space constraints
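The quoted rules can be modelled as a small simulation. This is a sketch under assumptions: `simulateDownloads` is a made-up name, the cost is rounded up to a whole FIQ, the balance starts empty, and a top-up is repeated until the balance covers the file. None of these details are confirmed by the wiki text, and the cache resets it mentions are not modelled.

```javascript
const FIQ_PER_MB = 20; // from the wiki quote
const TOP_UP = 1000;   // 1,000 GP buys 1,000 FIQ

// Hypothetical model: deduct FIQ per download, topping up with GP whenever
// the balance cannot cover the next file.
function simulateDownloads(sizesInBytes, startingFiq = 0) {
  let fiq = startingFiq;
  let gpSpent = 0;
  for (const size of sizesInBytes) {
    const cost = Math.ceil((size / 1e6) * FIQ_PER_MB);
    while (fiq < cost) {
      fiq += TOP_UP;
      gpSpent += TOP_UP;
    }
    fiq -= cost;
  }
  return { gpSpent, fiqLeft: fiq };
}

// Five 20 MB images starting from an empty balance.
console.log(simulateDownloads(new Array(5).fill(20e6))); // { gpSpent: 2000, fiqLeft: 0 }
```

Under this model, five 20 MB downloads use up exactly 2,000 GP, which agrees with the test result reported earlier in the thread.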

Another recent change is now force reloading will cost 50 image limits instead of 5. https://forums.e-hentai.org/index.php?s=&showtopic=270304&view=findpost&p=6357619

ccloli commented 9 months ago

The wiki was recently updated to clarify the download cost.

Yep, the cost was pointed out in the update log. Though in the thread discussing GP cost, some said it's 21, which is what's written in this script's wiki. Maybe that's because I mixed up MB and MiB? idk.

Another recent change is now force reloading will cost 50 image limits instead of 5.

Thanks for clarifying, I'll update the wiki. That link adds an "nl" query to the URL, which tells EH to return the image from the raw server. The image is then not loaded from H@H, which is why the cost is a bit higher.

Though the script doesn't support the approach described in the thread (it only sends "nl" when needed, and that feature can even be disabled, but it doesn't support sending "nl" all the time), I can see why some other scraping tools add such a feature, since loading from the source server is overall more stable than loading from H@H.

But I still think a cost of 50 is too much. It definitely kills such scrapers, but it may also affect normal users. In some regions or at certain times when the network is not that stable, or when the assigned H@H node is really garbage, you may need to reload it a lot. For that cost, you could load a 2.5 MB raw image.

FooIbar commented 9 months ago

Yep, the cost was pointed out in the update log. Though in the thread discussing GP cost, some said it's 21, which is what's written in this script's wiki. Maybe that's because I mixed up MB and MiB? idk.

Are you referring to this?

  • Gallery A (3.24 GB) "Manual" download cost: 70,000 GP (21.10 GP/MB)
  • Gallery B (1000.7 MB) "Manual" download cost: 21,000 GP (20.99 GP/MB)

Recently 10b changed all file size units displayed to binary prefix format. So:

  • Gallery A is 3.24 GiB = 3,478,923,510 B = 3,479 MB (20.12 GP/MB)
  • Gallery B is 1000.7 MiB = 1,049,310,003 B = 1,049 MB (20.02 GP/MB)
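The conversion can be reproduced in a few lines. A minimal sketch: `gpPerMB` is a made-up helper, and rounding to whole decimal MB before dividing simply mirrors the figures in this comment.

```javascript
const MIB = 1024 ** 2; // bytes in one MiB
const GIB = 1024 ** 3; // bytes in one GiB

// Convert a byte count to whole decimal MB (1 MB = 1,000,000 bytes),
// then divide the GP cost by it, formatted to two decimals.
function gpPerMB(costGP, sizeBytes) {
  const sizeMB = Math.round(sizeBytes / 1e6);
  return (costGP / sizeMB).toFixed(2);
}

console.log(gpPerMB(70000, 3.24 * GIB));   // 20.12
console.log(gpPerMB(21000, 1000.7 * MIB)); // 20.02
```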

In some regions or at certain times when the network is not that stable, or when the assigned H@H node is really garbage, you may need to reload it a lot.

The dispatcher of H@H has been changed recently as well, which should somewhat help with that?

ccloli commented 9 months ago

Recently 10b changed all file size units displayed to binary prefix format.

    var leastCost = page * perCost;
    // 1 point per 0.1 MB since August 2019; anything under 0.1 MB is also counted, so assume each image carries an extra < 100 KB
    var normalCost = Math.ceil((size / 1e5) + page * (1 + perCost));
    var cost = leastCost;
    var gp = Math.ceil(size / 1e5) * 2 + page;

My bad, I forgot that I added a +1 offset to each page, and I confused that with the 21 GP cost. The script itself counts 2 GP per 100 KB (in other words, 20 GP per 1 MB = 1,000,000 B), which is accurate, but I added an offset for inaccurate cases.
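One way to see how the offset can produce a 21 figure is to wrap the `gp` line from the snippet above in a function. The name `estimateGP` is made up for illustration; the formula itself is the one quoted from the script.

```javascript
// Mirrors the `gp` line above: 2 GP per started 100 KB,
// plus a 1 GP buffer per page.
function estimateGP(sizeBytes, pages) {
  return Math.ceil(sizeBytes / 1e5) * 2 + pages;
}

console.log(estimateGP(1e6, 1)); // 21: 20 GP for 1 MB plus the 1 GP page offset
console.log(estimateGP(1e6, 0)); // 20: the underlying rate without the offset
```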

The dispatcher of H@H has been changed recently as well, which should somewhat help with that?

Probably, but Tenboro did point out my cases:

(Asia traffic has been a bit unstable overall due to a combination of GFW issues and CF aggroing on some third party viewers whose traffic look like "HTTP requests trying to impersonate browsers".)

Accessing EH from China requires a proxy or VPN, which can make things more stable or worse depending on whether the network path to the proxy server is stable; both client-to-proxy and proxy-to-H@H count. Considering EX is using CloudFlare too (wtf, I didn't notice that before, is that really "safe" for EX?), both EH and EX may assign a better H@H server based on CloudFlare's GeoIP, but only in theory.

ccloli commented 9 months ago

BTW, do you happen to be a donator? Can you help me verify some settings?

In Image Load Settings, which controls whether images are loaded from H@H, there are 2 options that only apply to donators: No [Modern/HTTPS] and No [Legacy/HTTP]. Can you help me check:

I wonder whether that setting costs 50 points now, and whether reloading costs double.

FooIbar commented 9 months ago

The cost to load an image is 11 for 1600x and lower, 15 for 2400x. The cost to reload an image is not evidently affected by image size setting but varies from 50 to 55 (Idk why). Though I think it's better to ask Tenboro for confirmation.

That's for HTTPS, images won't load in HTTP.

ccloli commented 9 months ago

Thanks for kindly testing, I'll update it in the next version.

The cost to load an image is 11 for 1600x and lower, 15 for 2400x.

Looks like it's +10 of normal H@H loads. Thankfully it's not +50.

Your result may not be that accurate, because EH now counts down your image limit usage very fast, so unless you keep refreshing your "Overview" page, it'd drop 1-2 points in a second. Fun fact: if you keep refreshing to see your latest image limits, it'll never go down.

The cost to reload an image is not evidently affected by image size setting but varies from 50 to 55 (Idk why).

I think it's related. From my tests, for a normal user, reloading an image costs your resolution setting cost (1280x/960x/auto = 1, 1600x = 3, 2400x = 5) + 50. That means reloading an image should cost 51~55 depending on your resolution setting.
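The rule described here fits in a small lookup. A sketch only: `reloadCost` and the table keys are made up for illustration, and the numbers are the ones reported in this comment, not confirmed by Tenboro.

```javascript
// Per-image cost by resolution setting, as reported in this comment.
const RESOLUTION_COST = { auto: 1, '960x': 1, '1280x': 1, '1600x': 3, '2400x': 5 };
const RELOAD_SURCHARGE = 50;

// Hypothetical helper: image-limit cost of reloading one image.
function reloadCost(setting) {
  const base = RESOLUTION_COST[setting];
  if (base === undefined) {
    throw new Error(`Unknown resolution setting: ${setting}`);
  }
  return base + RELOAD_SURCHARGE;
}

console.log(reloadCost('auto'));  // 51
console.log(reloadCost('1600x')); // 53
console.log(reloadCost('2400x')); // 55
```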

Interesting that reloading costs the same as for a normal user. Though I guess if you're a donator and have enabled the option to disable H@H loads, you don't need that reload link, since the image is already loaded from the E-Hentai source servers.

That's for HTTPS, images won't load in HTTP.

Though the image is not loaded, it still costs your limits. But I guess it should be the same as HTTPS.

Though I think it's better to ask Tenboro for confirmation.

That'd be better. Though I think Tenboro is kind, given my situation I still feel a bit guilty or something like that. Besides, for personal reasons, I've barely said or posted anything in public in recent years (probably except on GitHub?), and I've never posted in the EH forums.

But anyway, I think your data is clear enough, thanks again for your help. 😇

FooIbar commented 9 months ago

Though the image is not loaded, it still costs your limits. But I guess it should be the same as HTTPS.

FWIW, I did test with HTTP and the cost was about 60+ for load + reload. I didn't post it before because I couldn't get accurate numbers due to auto-reloading.

ccloli commented 9 months ago

FWIW, I did test with HTTP and the cost was about 60+ for load + reload. I didn't post it before because I couldn't get accurate numbers due to auto-reloading.

Thanks for that. Considering it's load + reload, I think it should be the same as HTTPS.

PowerWasher9000 commented 9 months ago

Is the updated calculator accurate for older galleries? I downloaded a 730 MB gallery (archive is worth 30,637 GP) from 2010, and the script estimate was 1010 + 16328 GP. I downloaded using this script and it only cost around 5000+ GP.

ccloli commented 9 months ago

Is the updated calculator accurate for older galleries?

Yes and no. The base calculation is accurate, since it's based on what I tested on EH, but I added a buffer in case the file size of each image differs, so the final estimated cost may be a bit higher than expected.

I downloaded a 730 MB gallery (archive is worth 30,637 GP) from 2010, and the script estimate was 1010 + 16328 GP.

Then you hit the gallery that requires "repack", a gallery has been archived before but not downloaded in 7 days. The cost to download such an archive is doubled. 30,637 is close to double the estimated cost minus the offset buffer, which is 16328 - 1010 = 15318.

I downloaded using this script and it only cost around 5000+ GP.

If the image is smaller than your image resize setting on EH and the file is not big, say a 500 KB image at 1200x, or if you've enabled the Source Nexus hath perk and the file is smaller than 3 MB, then the image delivered over H@H may be the original image. These images don't have a download original link, because the image shown on the page is already the original, so it only costs image limits instead of GP.

FooIbar commented 9 months ago

Then you hit the gallery that requires "repack", a gallery has been archived before but not downloaded in 7 days

*not downloaded in 30 days

ccloli commented 9 months ago

Then you hit the gallery that requires "repack", a gallery has been archived before but not downloaded in 7 days

*not downloaded in 30 days

My bad, I mixed it up with the free re-download period. 🤣 Thanks for the correction.

PowerWasher9000 commented 9 months ago

Ah, thank you very much for clarifying. I'm just going to have to wing it with the estimates then. It's a bit scary at times because some bigger galleries show 100k+ estimates.