sophie-glk / bang

A firefox addon which adds bangs (known from duckduckgo) to various search engines.
MIT License

[Feature Request] Autoupdate bangs.json #6

Closed serovar closed 4 years ago

serovar commented 4 years ago

If you use bangs that translate from one language to another (e.g. !enit), the search engine redirects you to Google Translate without filling the searched words into the translation form (on DuckDuckGo it works fine).

Edit: thanks to @babastienne for pointing out that the issue is related to a problem with the bangs.json file.

babastienne commented 4 years ago

I don't know how you tried but it's working fine for me.

What you probably did:

You configured your bang as enit and the associated search as https://translate.google.com/#view=home&op=translate&sl=en&tl=it&text=@search@

What you should have done:

If you look at how DuckDuckGo performs this search, you can see that the search text is not passed through query parameters but directly in the URL fragment. So the right associated search to configure is this one: https://translate.google.com/#en/it/@search@

Try this, and if it's still not working, could you please send a screenshot of your extension configuration?
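
To illustrate why the template matters, here is a minimal sketch of how a placeholder such as @search@ could be filled in. The function name buildSearchUrl is hypothetical and this is not taken from the addon's source; it only assumes that the placeholder is replaced with the URL-encoded search text.

```javascript
// Hypothetical helper, not the addon's actual implementation.
// Replaces every occurrence of the @search@ placeholder with the encoded query.
function buildSearchUrl(template, query) {
    return template.split('@search@').join(encodeURIComponent(query));
}

console.log(buildSearchUrl('https://translate.google.com/#en/it/@search@', 'good morning'));
// → https://translate.google.com/#en/it/good%20morning
```

With a template that contains no placeholder at all (for example, just the Google Translate home page), the function returns the template unchanged, which matches the behaviour reported above.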

serovar commented 4 years ago

Thanks, configuring it manually as you suggest makes it work.

Regarding my previous configuration, I did not set it up manually; I was trying the bang as it normally works on DuckDuckGo. Is there no way to make it work by default? It would be tedious to manually reconfigure every Google Translate bang.

babastienne commented 4 years ago

Not quite sure about this. Maybe @sophie-glk can answer.

babastienne commented 4 years ago

Well, it seems to work natively with almost all the bangs supported in DuckDuckGo.

The supported bangs are all in the file bangs.json in this project. Looking at this file, you can see that all the translation bangs redirect straight to the home page of Google Translate (which explains why it wasn't working). The demauro bang redirects to a 404 error page.

So I assume it works most of the time, but for some bangs the JSON dictionary isn't quite right.

I'm curious, @sophie-glk: where did you get this JSON? Do you plan an automatic update in case some website addresses change?

serovar commented 4 years ago

So I suppose that these specific bangs have been updated in DuckDuckGo, since they do work there, but have not been updated in the JSON file. In any case, automatic updating of this file would be much appreciated.

sophie-glk commented 4 years ago

The issue seems to be with the way I get the bangs. I get the list of all available bangs using this command:

curl https://duckduckgo.com/bang_lite.html | grep -A100000 "And here's the full list alphabetically:" | awk 'NR>1{print $1}' RS='(' FS=')' | grep ! > bangs.txt

After that I use the following puppeteer [https://github.com/GoogleChrome/puppeteer] script:

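// Resolve every bang in bangs.txt to its target URL by letting DuckDuckGo redirect.
process.setMaxListeners(0);
const puppeteer = require('puppeteer');
const fs = require('fs');

// One bang per line; drop empty lines (e.g. the one after the trailing newline).
const bangs = fs.readFileSync('bangs.txt').toString().split("\n").filter(Boolean);
console.log(bangs);

const workers = 12;   // number of browsers running in parallel
let finished = 0;     // workers that have completed their slice
let processed = 0;    // bangs visited so far, across all workers
let result = [];
const chunk = bangs.length / workers;

for (let n = 0; n < workers; n++) {
    get(Math.floor(n * chunk), Math.floor((n + 1) * chunk), function(rs) {
        result = result.concat(rs);
        finished++;
        // Write the output once every worker has reported back.
        if (finished == workers) {
            const json = JSON.stringify(result);
            fs.writeFile("output", json, function(err) {
                if (err) {
                    return console.log(err);
                }
            });
        }
    });
}

// Visit bangs[start..end) in one browser and collect [bang, url] pairs.
async function get(start, end, callback) {
    const rs = [];
    const browser = await puppeteer.launch({
        headless: false
    });
    const page = await browser.newPage();
    for (let i = start; i < end; i++) {
        const b = bangs[i];
        try {
            // Encode the query so bangs containing characters like # or + survive.
            await page.goto('https://duckduckgo.com/?q=' + encodeURIComponent(b + " bang"));
        } catch (err) {}
        try {
            // Wait for the redirect to the bang's target site, then record where we ended up.
            await page.waitForNavigation();
            rs.push([b, page.url()]);
        } catch (err) {}
        console.log(i + " " + page.url());
        console.log(rs.length);
        processed++;
        console.log(processed);
    }

    await browser.close();
    callback(rs);
}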

(I know it's not a pretty solution, but it's the one I got working.) It seems to work on most pages, but not on Google Translate for some reason. We could try to find a way to fix this, find a better way to get the bangs, or just fix it manually by writing a script. I am open to suggestions!
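
One possible way to spot entries like the Google Translate ones is sketched below. It assumes the output is a JSON array of [bang, url] pairs, as written by the script above, and that the placeholder search term was "bang". The helper name findBrokenEntries and the sample data are made up for illustration.

```javascript
// Hypothetical helper: returns the entries whose recorded URL lost the search term.
// `entries` is assumed to be an array of [bang, url] pairs.
function findBrokenEntries(entries, term = 'bang') {
    return entries.filter(([bang, url]) => {
        let decoded;
        try {
            decoded = decodeURIComponent(url);
        } catch (err) {
            decoded = url; // keep the raw URL if it is not valid percent-encoding
        }
        return !decoded.toLowerCase().includes(term);
    });
}

// Illustrative sample data only, not real entries from bangs.json.
const sample = [
    ['!enit', 'https://translate.google.com/'],
    ['!w', 'https://en.wikipedia.org/wiki/Special:Search?search=bang'],
];
console.log(findBrokenEntries(sample));
// → [ [ '!enit', 'https://translate.google.com/' ] ]
```

This check is only a heuristic: a site whose address happens to contain "bang" would pass even if the search term was dropped.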

serovar commented 4 years ago

I am not a programmer unfortunately, so I don't really have technical suggestions, but if there is something that I could test or do just let me know.

sophie-glk commented 4 years ago

I think I have fixed it. The issue seems to be this: [https://github.com/GoogleChrome/puppeteer/issues/257]. I now use await page.evaluate(() => location.href) instead of page.url().
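
As a rough sketch of that change: the URL is read from inside the page rather than from Puppeteer's own record of it. The wrapper name currentUrl is hypothetical; page.evaluate and location.href are the calls mentioned above.

```javascript
// Sketch: ask the browser itself for the current address.
// `page` is a Puppeteer Page; the arrow function runs in the page context,
// so location.href includes the #fragment that Google Translate uses.
async function currentUrl(page) {
    return page.evaluate(() => location.href);
}
```

In the loop of the script, this would replace the page.url() call, for example rs.push([b, await currentUrl(page)]).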

serovar commented 4 years ago

Thanks! Will the addon update include an autoupdate mechanism or a manual one for the bangs.json file?

sophie-glk commented 4 years ago

It would probably be a good idea to implement automatic updates of the bang list, as that would make fixing bugs like this a lot easier. I will look into it.

sophie-glk commented 4 years ago

The feature is now fully implemented. For now, the update has to be triggered manually through the settings. If it doesn't cause any issues, it will run automatically in the next release.