jackfranklin / tweet-parser

Parsing tweets into lists of entities.
76 stars, 7 forks

NodeJS support #1

Open gmvivekanandan opened 2 years ago

gmvivekanandan commented 2 years ago

This is a cool project and exactly what I need; it suits my use case perfectly. Is there any way I could use it with Node.js?

PeterPilley commented 2 years ago

+1

PeterPilley commented 2 years ago

Here is a quick, hacky version that works in Node.js; I hope it helps others. I will create a fork to turn it into a proper module later.

function parse(data){

    let user_list = [];
    let hashtag_list = [];
    let url_list = [];

    // Clean up data replacing line returns with spaces
    data = data.replace(/(\r\n|\n|\\n|\r)/gm, " ");

    // Regexes from Vincent Loy's awesome tweetparser
    const REGEX_URL = /(?:^|\s)(f|ht)tps?:\/\/([^\s\t\r\n<]*[^\s\t\r\n<)*_,\.])/g, // URLs (^|\s also matches a URL at the very start of the text)
        REGEX_USER = /\B@([a-zA-Z0-9_]+)/g, // @users
        REGEX_HASHTAG = /\B(#[á-úÁ-Úä-üÄ-Üa-zA-Z0-9_]+)/g; // #hashtags

    // Process urls (match[0] is the full match, including any leading whitespace)
    for (const match of data.matchAll(REGEX_URL)) {
        url_list.push(match[0].trim());
    }

    // Process users (drop the leading "@")
    for (const match of data.matchAll(REGEX_USER)) {
        user_list.push(match[0].slice(1));
    }

    // Process hashtags (drop the leading "#")
    for (const match of data.matchAll(REGEX_HASHTAG)) {
        hashtag_list.push(match[0].slice(1));
    }

    // Return 
    return {
        original_message: data,
        users: user_list,
        hashtags: hashtag_list,
        urls: url_list
    };

}

module.exports = {
    parse: parse
};

How to use it in Node.js (note: the tweet was chosen at random):

const tp = require('./tweeterParser');
let tweet = 'My first batch of contributions are now live in \n' +
    '@gitlab\n' +
    ' 15.2. Helping make the Terraform Module registry API better for everyone!\n' +
    '\n' +
    'Lots of learning and refreshing Ruby/Rails 🎉';

console.log(tp.parse(tweet));

Console:

{
  original_message: 'My first batch of contributions are now live in  @gitlab  15.2. Helping make the Terraform Module registry API better for everyone!  Lots of learning and refreshing Ruby/Rails 🎉',
  users: [ 'gitlab' ],
  hashtags: [],
  urls: []
}
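The sample tweet above only exercises the @user branch. As a rough sketch, the snippet below also exercises hashtags and URLs. It restates `parse()` inline in condensed form so that it runs on its own, with the URL pattern anchored on `^|\s` so a URL at the very start of the text is also picked up. The tweet text, the `@jack_franklin` handle and the `example.com` URLs are made up for illustration.

```javascript
// Condensed restatement of parse() from the comment above, inlined so this
// snippet is self-contained (in a project you would require('./tweeterParser')).
function parse(data) {
    // Replace real line breaks and literal "\n" sequences with spaces
    const text = data.replace(/(\r\n|\n|\\n|\r)/gm, " ");

    // (?:^|\s) lets a URL match at the very start of the text as well
    const REGEX_URL = /(?:^|\s)(f|ht)tps?:\/\/([^\s\t\r\n<]*[^\s\t\r\n<)*_,\.])/g;
    const REGEX_USER = /\B@([a-zA-Z0-9_]+)/g;
    const REGEX_HASHTAG = /\B(#[á-úÁ-Úä-üÄ-Üa-zA-Z0-9_]+)/g;

    // With the g flag, String.prototype.match returns the full matches (or null)
    return {
        original_message: text,
        users: (text.match(REGEX_USER) || []).map((u) => u.slice(1)),
        hashtags: (text.match(REGEX_HASHTAG) || []).map((h) => h.slice(1)),
        urls: (text.match(REGEX_URL) || []).map((u) => u.trim()),
    };
}

// Made-up tweet, for illustration only
const result = parse('Trying #nodejs with @jack_franklin\nsee https://example.com/docs now');
console.log(result.users);    // [ 'jack_franklin' ]
console.log(result.hashtags); // [ 'nodejs' ]
console.log(result.urls);     // [ 'https://example.com/docs' ]
```

Note that the URL pattern deliberately refuses to end a match on characters such as `.`, `,` or `)`, so trailing punctuation after a link is left out of the extracted URL.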

jackfranklin commented 2 years ago

Hi all,

I am happy to give this package an update and publish a version to Node. I will try to take a look this week.

Thanks.

On Mon, 25 Jul 2022, 22:32, PeterPilley wrote:

Here is a quick, hacky version that works in Node.js; I hope it helps others. I will create a fork to turn it into a proper module later.


PeterPilley commented 2 years ago


Happy to help if you want or need it. I think it would be cool to have a Vue.js module as well.

PeterPilley commented 1 year ago

Hi, I have made some updates locally to our version. Are you happy to receive a PR? I have added features to identify and extract emojis, using the awesome regex from here: https://medium.com/reactnative/emojis-in-javascript-f693d0eb79fb

It works really well.
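For readers wondering what emoji extraction might look like, here is a minimal sketch. It is not the code from the local version mentioned above, and it does not use the regex from the linked article: as an assumption, it substitutes the built-in Unicode property escape `\p{Extended_Pictographic}` (needs the `u` flag). The `extractEmojis` helper name is hypothetical.

```javascript
// Hypothetical helper, not part of tweet-parser or of the local version
// described above. Assumption: \p{Extended_Pictographic} stands in for the
// regex from the linked article.
const REGEX_EMOJI = /\p{Extended_Pictographic}/gu;

function extractEmojis(text) {
    // With the g flag, match() returns every matched symbol, or null if none
    return text.match(REGEX_EMOJI) || [];
}

const emojis = extractEmojis('Lots of learning and refreshing Ruby/Rails 🎉');
console.log(emojis); // [ '🎉' ]
```

This simple pattern matches one pictographic code point at a time, so multi-code-point emoji (ZWJ sequences, skin-tone variants, flags) would be split up or missed; a fuller regex such as the one in the article is likely needed for those.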

jackfranklin commented 1 year ago

Very happy to review a PR! Thanks

On Sun, 27 Nov 2022 at 00:46, PeterPilley wrote:

Hi, I have made some updates locally to our version. Are you happy to receive a PR? I have added features to identify and extract emojis, using the awesome regex from here: https://medium.com/reactnative/emojis-in-javascript-f693d0eb79fb

It works really well.
