manga-download / haruneko

Prototype of HakuNeko based on NW.js + TypeScript
https://haruneko-docs.pages.dev

MangaPlanet, Set-Cookies and Fetch #717

Open MikeZeDev opened 3 months ago

MikeZeDev commented 3 months ago

Context: Yanmaga, MangaPlaza, MangaPlanet. SpeedBinb viewer.

On those 3 websites, fetching the config using FetchJSON(request) returns the data properly, and the response asks us to set 3 CloudFront cookies that control further access.


However, if we continue to trace our code, we encounter an access error (403) on the next FetchJSON call:


The problem is clearly the missing cookies in the request (CloudFront-Key-Pair-Id is one of the CloudFront cookies we need).
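For reference, CloudFront signed-cookie access control uses a fixed triplet of cookies (CloudFront-Key-Pair-Id, CloudFront-Policy, CloudFront-Signature). A minimal sketch of picking them out of a response's raw Set-Cookie values (the helper and sample values are mine, not part of the scraper):

```typescript
// The cookie triplet used by CloudFront signed-cookie access control.
const cloudFrontCookieNames = ['CloudFront-Key-Pair-Id', 'CloudFront-Policy', 'CloudFront-Signature'];

// Given the raw Set-Cookie header values of a response, return the
// CloudFront cookies it tries to set (attributes like Path are dropped).
function extractCloudFrontCookies(setCookieHeaders: string[]): Map<string, string> {
    const cookies = new Map<string, string>();
    for (const header of setCookieHeaders) {
        const [pair] = header.split(';'); // keep only the leading "name=value" part
        const separator = pair.indexOf('=');
        const name = pair.slice(0, separator).trim();
        if (cloudFrontCookieNames.includes(name)) {
            cookies.set(name, pair.slice(separator + 1));
        }
    }
    return cookies;
}
```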

For Yanmaga & MangaPlaza, I solved this with a dirty workaround: let the chapter page load first (so the cookies get set), then call the decorator.

    public override async FetchPages(chapter: Chapter): Promise<Page[]> {
        await FetchWindowScript(new Request(new URL(chapter.Identifier, this.URI), { headers: { Referer: this.URI.origin } }), 'true', 3000); // set necessary cookies
        return SpeedBinb.FetchPagesSinglePageAjaxv016130.call(this, chapter);
    }

Dirty, but it works. It's of no use for MangaPlanet though. For MangaPlanet I have to open the chapter URL in a window and use a script to perform the config fetch from within that window 🤦

const JsonFetchScript = `
    new Promise((resolve, reject) => {
        fetch('{URI}')
            .then(response => response.json())
            .then(json => resolve(json))
            .catch(error => reject(error));
    });
`;

export async function FetchPagesSinglePageAjax(this: MangaScraper, chapter: Chapter): Promise<Page[]> {
    const { viewerUrl, SBHtmlElement } = await GetViewerData.call(this, chapter);
    const { request, sharingKey } = await CreatePtBinbRequestData(viewerUrl, SBHtmlElement);
    const config = await FetchWindowScript<JSONPageData>(new Request(viewerUrl), JsonFetchScript.replace('{URI}', request.url), 2000);
    return await getPageLinks_v016130.call(this, config.items[0], sharingKey, chapter);
}

For some mysterious reason, fetching the config like this properly sets the cookies for subsequent FetchJSON requests.

MikeZeDev commented 3 days ago

The Chrome browser prevents us from accessing Set-Cookie headers from JavaScript. I tested on a bare Node project and was able to access the Set-Cookie headers using native Node fetch.

MikeZeDev commented 3 days ago

Here is a simple application that works:

import { JSDOM } from "jsdom";
global.DOMParser = new JSDOM().window.DOMParser;

main().catch(err => console.log(err.message, err.stack)).finally(() => {
    console.log('end');
});

async function main() {

    const url = 'https://mangaplanet.com/reader?cid=64f300af38575';
    const { viewerUrl, SBHtmlElement } = await GetViewerData(url);
    const { request, sharingKey } = await CreatePtBinbRequestData(viewerUrl, SBHtmlElement);

    let response = await fetch(request);
    console.log(response.headers.getSetCookie());
}
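From there the captured cookies could be replayed manually on a follow-up request. A sketch of the idea (buildCookieHeader is a hypothetical helper of mine, and getSetCookie() needs Node 18.14 or newer):

```typescript
// Turn the Set-Cookie header values of one response into a single
// Cookie header usable on a follow-up request (attributes stripped).
function buildCookieHeader(setCookieHeaders: string[]): string {
    return setCookieHeaders
        .map(header => header.split(';')[0].trim()) // keep only "name=value"
        .join('; ');
}

// Hypothetical usage with the response from the Node snippet above:
// const followUp = new Request('https://mangaplanet.com/some-endpoint', {
//     headers: { Cookie: buildCookieHeader(response.headers.getSetCookie()) }
// });
```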

/**
 * Return the real chapter URL & the SpeedBinb "pages" element from that page, so we can work with them
 * @param chapterurl - The URL of the chapter page to load
 */
async function GetViewerData(chapterurl) {
    let viewerUrl = new URL(chapterurl);
    const request = new Request(viewerUrl, {
        headers: {
            Referer: 'https://mangaplanet.com',
            Cookie: 'mpaconf=18'
        }
    });

    const response = await fetch(request);
    const data = await response.text();
    const dom = new DOMParser().parseFromString(data, 'text/html');
    const SBHtmlElement = dom.querySelector('div#content.pages');
    //handle redirection. Sometimes chapter is redirected
    if (response.redirected) {
        viewerUrl = new URL(response.url);
    }
    return { viewerUrl, SBHtmlElement };
}

/**
 * Create the first SpeedBinb AJAX request to perform, using the viewerUrl GET parameters and the endpoint from the SpeedBinb HTML node
 * @param viewerUrl - Read Url of the SpeedBinb Viewer
 * @param sbHtmlElement - HTMLElement extracted from said page
 */
async function CreatePtBinbRequestData(viewerUrl, sbHtmlElement) {
    let cid = viewerUrl.searchParams.get('cid') || sbHtmlElement.dataset['ptbinbCid'];
    /*
    //in case cid is not in url and not in html, try to get it from page redirected by Javascript/ Meta element
    if (!cid) {
        cid = await FetchWindowScript < string > (new Request(viewerUrl), 'new URL(window.location).searchParams.get("cid");', 5000);
    }*/
    if (!cid) throw new Error('Unable to find CID (content ID) !');

    const sharingKey = _tt(cid);
    const uri = getSanitizedURL(viewerUrl.href, sbHtmlElement.dataset.ptbinb);
    const dmytime = String(Date.now());
    uri.searchParams.set('cid', cid);
    uri.searchParams.set('dmytime', dmytime);
    uri.searchParams.set('k', sharingKey);

    const u0 = viewerUrl.searchParams.get('u0');
    const u1 = viewerUrl.searchParams.get('u1');
    if (u0) uri.searchParams.set('u0', u0);
    if (u1) uri.searchParams.set('u1', u1);

    const request = new Request(uri, {
        headers: {
            Referer: viewerUrl.href
        }
    });
    return { cid, sharingKey, dmytime, u0, u1, request };
}
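For illustration, here is the shape of the final AJAX URL this builds, with placeholder host and values (my own example; the real endpoint comes from the data-ptbinb attribute and the real key from _tt(cid)):

```typescript
// Placeholder host and values, just to show the resulting URL shape.
const uri = new URL('https://cdn.example.com/bibGetCntntInfo.php');
uri.searchParams.set('cid', '64f300af38575');
uri.searchParams.set('dmytime', '1700000000000');
uri.searchParams.set('k', 'SHARINGKEY');
// → https://cdn.example.com/bibGetCntntInfo.php?cid=64f300af38575&dmytime=1700000000000&k=SHARINGKEY
```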

function getSanitizedURL(base, append) {
    const baseURI = new URL(append, base + '/');
    baseURI.pathname = baseURI.pathname.replaceAll(/\/\/+/g, '/');
    return baseURI;
}

function _tt(t) {
    // 16-character seed: current time in hex, left-padded with 'x' (the original site uses w.getRandomString(16))
    const n = Date.now().toString(16).padStart(16, 'x');
    // Tile the cid until it covers at least 16 characters
    const i = Array(Math.ceil(16 / t.length) + 1).join(t);
    const r = i.substring(0, 16);         // first 16 characters of the tiling
    const e = i.substring(i.length - 16); // last 16 characters of the tiling

    // Interleave each seed character with a character picked by running XOR checksums,
    // producing a 32-character sharing key
    let s = 0;
    let u = 0;
    let h = 0;
    return n.split("").map(function (t, i) {
        return s ^= n.charCodeAt(i),
            u ^= r.charCodeAt(i),
            h ^= e.charCodeAt(i),
            t + "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"[s + u + h & 63];
    }).join("");
}
MikeZeDev commented 21 hours ago

The problem has to do with the SameSite attribute of the cookies we need.
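If those cookies are set with SameSite=Lax or Strict (or left unspecified, which Chrome treats as Lax), the browser will not attach them to cross-site requests, which would explain the 403. A small sketch (my helper, not project code) to inspect the attribute on the raw Set-Cookie values from the Node experiment above:

```typescript
// Report the SameSite attribute of each cookie in a list of raw Set-Cookie
// header values; an absent attribute defaults to Lax in Chrome.
function sameSiteOf(setCookieHeaders: string[]): Map<string, string> {
    const result = new Map<string, string>();
    for (const header of setCookieHeaders) {
        const [pair, ...attributes] = header.split(';').map(part => part.trim());
        const name = pair.split('=')[0];
        const sameSite = attributes.find(attr => attr.toLowerCase().startsWith('samesite='));
        result.set(name, sameSite ? sameSite.slice('SameSite='.length) : 'unspecified (Lax by default)');
    }
    return result;
}
```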