bda-research / node-crawler

Web Crawler/Spider for NodeJS + server-side jQuery ;-)
MIT License

the encoding declaration in @types/crawler #464

Closed: wubinhong closed this issue 8 months ago

wubinhong commented 9 months ago

Issue Description

Following the Crawler tutorial, I cannot assign null to the encoding option of CrawlerRequestOptions, as shown below:

image

Troubleshooting

The encoding option as declared in @types/crawler:

    interface CrawlerRequestOptions {
        encoding?: string | undefined;
    }

How encoding is handled in lib/crawler.js:

image

Suggestion

Solution 1 (Recommended)

Change the encoding declaration in @types/crawler as below:

    interface CrawlerRequestOptions {
        encoding?: string | undefined | null;
    }
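To make the effect of Solution 1 concrete, here is a minimal sketch. The interfaces are local stand-ins that model only a couple of fields (they are not the real @types/crawler declarations), and `WithNullableEncoding` is a hypothetical helper name for an interim workaround that widens the existing type on the caller's side.

```typescript
// Local stand-in for the current declaration; only `uri` and `encoding` are modelled.
interface CurrentRequestOptions {
    uri?: string;
    encoding?: string | undefined;
}

// Solution 1: the same declaration, widened so that null is accepted.
interface ProposedRequestOptions {
    uri?: string;
    encoding?: string | null | undefined;
}

// Hypothetical interim workaround: derive a widened type from the existing one
// without waiting for the published typings to change.
type WithNullableEncoding<T extends { encoding?: unknown }> =
    Omit<T, "encoding"> & { encoding?: string | null };

// Both of these type-check, because `encoding` now admits null.
const proposed: ProposedRequestOptions = {
    uri: "https://example.com/logo.png",
    encoding: null,
};

const patched: WithNullableEncoding<CurrentRequestOptions> = {
    uri: "https://example.com/logo.png",
    encoding: null,
};

console.log(proposed.encoding, patched.encoding);
```

With strict null checks enabled, assigning `encoding: null` to a value typed as `CurrentRequestOptions` is rejected, which matches the error described above.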

Solution 2

Change the implementation of handling encoding as below:

    Crawler.prototype._doEncoding = function (options, response) {
        var self = this;

        if (!options.encoding) {
            return;
        }

        if (options.forceUTF8) {
            var charset = options.incomingEncoding || self._parseCharset(response);
            response.charset = charset;
            log('debug', 'Charset ' + charset);

            if (charset !== 'utf-8' && charset !== 'ascii') { // convert response.body into 'utf-8' encoded buffer
                response.body = iconvLite.decode(response.body, charset);
            }
        }

        response.body = response.body.toString();
    };

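The difference Solution 2 makes can be sketched as two small predicates. This assumes the current check in lib/crawler.js is a strict comparison against null (the screenshot is not reproduced here, so that is an assumption); the function names are hypothetical and exist only for this illustration.

```typescript
type Encoding = string | null | undefined;

// Assumed current behaviour: only an explicit null skips decoding.
function skipsDecodingStrict(encoding: Encoding): boolean {
    return encoding === null;
}

// Solution 2: any falsy value (null, undefined, or "") skips decoding.
function skipsDecodingFalsy(encoding: Encoding): boolean {
    return !encoding;
}

// Compare both checks across the values a caller might pass.
const samples: Encoding[] = [null, undefined, "", "utf8"];
for (const e of samples) {
    console.log(String(e), skipsDecodingStrict(e), skipsDecodingFalsy(e));
}
```

Under the falsy check, a caller restricted by the current typings could pass `undefined` or an empty string instead of null. Whether an omitted option actually reaches this check as `undefined` depends on how the crawler merges its default options, which is not shown in this issue.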
mike442144 commented 8 months ago

Hi, I think this is a bug in TypeScript's type inference. I don't like TypeScript, and I suggest you not use it either.