twindle-co / twindle

Twindle - an open source project for beginners. Converting twitter threads to pdf, epub, and mobi format to be read by Kindle.
MIT License
134 stars, 133 forks

Some edge cases and possible fixes which must be tested #768

Closed: tr0mbl3y closed this issue 3 years ago

tr0mbl3y commented 3 years ago

Some cases that I noted:

  1. In transformation/helper.js, the regex /https?:\/\/\/[a-zA-Z_]{1,20}\/status\/([0-9]*)/g returns an empty array when it is given a URL whose username contains digits. Possible solutions: a) allow digits in the username part, i.e. /https?:\/\/\/[a-zA-Z_0-9]{1,20}\/status\/([0-9]*)/g, or b) use a direct pattern such as /[\d+]{10,}/g, which can also extract the ID and handles cases like a digits-only username.

  2. In Validation/tweet_endpoint.js, if the difference between the current time and the tweet's creation time somehow evaluates to 7 days, it might give an unnecessary "tweet older than 7 days" error. The problem is here: return differenceInDays > 7. Possible solutions: a) return differenceInDays >= 7, or b) could we use Math.floor here? In many cases I noted that the time difference evaluates to a floating-point number, so Math.floor would turn values like 6.9 into 6. The issue is that if the value is somehow 6.9 followed by sixteen 9s, it is treated as 7 [depends on the IDE we are using, I guess], so the check will evaluate to true. See:

console.log(Math.floor(6.999999999999999)); // result: 6
console.log(Math.floor(6.9999999999999999)); // result: 7 [tested on the Mozilla developer IDE]

I have no idea why the second result comes out as 7. Please look into it and let the team know what you find.
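The Math.floor result above can be reproduced in any JavaScript engine, because it comes from how the number literal is stored rather than from the IDE: the longer literal has more digits than a 64-bit floating-point number can hold, so it is rounded to exactly 7 before Math.floor ever runs. The sketch below shows this and compares the two checks from point 2. The helper names isOlderStrict and isOlderInclusive are made up for illustration and are not taken from Validation/tweet_endpoint.js.

```javascript
// Both literals are copied from the console.log lines above.
const below = 6.999999999999999;
const rounded = 6.9999999999999999;

// The longer literal cannot be represented as a 64-bit double,
// so it is already stored as exactly 7 when the code is parsed.
console.log(rounded === 7);        // true
console.log(Math.floor(below));    // 6
console.log(Math.floor(rounded));  // 7

// Hypothetical helpers (not from the repo) for the two comparisons in point 2.
const isOlderStrict = (differenceInDays) => differenceInDays > 7;
const isOlderInclusive = (differenceInDays) => differenceInDays >= 7;

// At exactly 7 days only the inclusive check reports the tweet as too old.
console.log(isOlderStrict(7), isOlderInclusive(7));  // false true

// Flooring before a strict comparison lets anything under 8 days through.
console.log(isOlderStrict(Math.floor(7.9)));  // false
console.log(isOlderStrict(7.9));              // true
```

Which comparison is right depends on whether a tweet that is exactly 7 days old should still be accepted, so this only shows the behaviour of each option.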

  3. In transformation/rich_rendering.js, this function:

    for (let x of mediaKeys) { 
    const mediaInfo = expandedMediaIncludes.find(({ media_key }) => media_key === x);

    Should it be { media_keys } instead? Please let me know in the comments.

  4. In Scraping/index.js, this function:

    const showRepliesButton = [...document.querySelectorAll('div[dir="auto"]')]
      .filter((node) => node.children[0] && node.children[0].tagName === "SPAN")
      .find((node) => node.children[0].innerHTML === "Show replies");
    if (showRepliesButton) {
      await waitFor(2000);

is essentially searching for the "Show replies" button and clicking it. I am assuming it searches for "Show replies" only once [correct me if I am wrong]; this might be the reason we are unable to fetch longer threads with 100+ tweets, I guess. @Mira-Alf mentioned in issue #728 that she is not receiving the full set of tweets.

Possible solution: a loop, so that if "Show replies" is found multiple times it keeps clicking and fetching results, as you mentioned in the meeting.
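A minimal sketch of that loop, assuming the button is located exactly as in the snippet above. The function names findShowRepliesButton and clickAllShowReplies, the root parameter (the page's document), and the maxClicks safety cap are assumptions for illustration; waitFor is passed in and assumed to return a promise, as in the original snippet.

```javascript
// Hypothetical helper: finds the "Show replies" button the same way the snippet above does.
function findShowRepliesButton(root) {
  return [...root.querySelectorAll('div[dir="auto"]')]
    .filter((node) => node.children[0] && node.children[0].tagName === "SPAN")
    .find((node) => node.children[0].innerHTML === "Show replies");
}

// Keeps clicking until no button is left or the assumed cap is reached.
// Returns how many times a button was clicked.
async function clickAllShowReplies(root, waitFor, maxClicks = 3) {
  let clicks = 0;
  while (clicks < maxClicks) {
    const showRepliesButton = findShowRepliesButton(root);
    if (!showRepliesButton) break; // nothing more to expand
    showRepliesButton.click();
    clicks += 1;
    await waitFor(2000); // give the page time to load the next batch of replies
  }
  return clicks;
}
```

The cap keeps the loop from running forever if the page keeps rendering a new button; whether a small cap is enough for 100+ tweet threads would need testing.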

NOTE: please test these
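For point 1, the two options can be compared with a quick check like the one below. This is only a sketch: the twitter.com host, the sample URLs, and the helper extractIds are assumptions made up for the test, since the pattern quoted in point 1 shows no host.

```javascript
// Patterns modelled on point 1, with an assumed host filled in for the test.
const lettersOnly = /https?:\/\/twitter\.com\/[a-zA-Z_]{1,20}\/status\/([0-9]*)/g;
const withDigits = /https?:\/\/twitter\.com\/[a-zA-Z_0-9]{1,20}\/status\/([0-9]*)/g;
// Option b as written in point 1. Note that [\d+] also accepts a literal "+";
// \d{10,} would be the tighter form.
const idOnly = /[\d+]{10,}/g;

// Hypothetical helper: returns the captured tweet IDs as strings.
function extractIds(text, pattern) {
  return [...text.matchAll(pattern)].map((match) => match[1]);
}

// Made-up URLs with digits-only usernames.
const shortName = "https://twitter.com/12345678/status/1338815042471546880";
const longName = "https://twitter.com/1234567890/status/1338815042471546880";

console.log(extractIds(shortName, lettersOnly)); // []
console.log(extractIds(shortName, withDigits));  // [ '1338815042471546880' ]
console.log(shortName.match(idOnly));            // [ '1338815042471546880' ]

// A digits-only username of 10 or more characters is also picked up by option b.
console.log(longName.match(idOnly)); // [ '1234567890', '1338815042471546880' ]
```

Option a keeps the ID tied to the /status/ segment, while option b matches any long run of digits, so it may need an extra check when the username itself is a long number.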

johnjacobkenny commented 3 years ago

@PuruVJ @Mira-Alf identify the issues, and create separate issues so someone else can pitch in. If nobody steps up in 2 or 3 days, then either of you can take it up

tr0mbl3y commented 3 years ago


Issues 1 and 2 are fixed, thanks @twindle-co/developer. I was completely wrong about point 3: it is indeed { media_key }. I am still looking into point 4.
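A small illustration of why { media_key } is correct in point 3: the destructuring pattern in the callback reads the property of that name from each element of the array, so it has to match the property on the objects, not the name of the list being looped over. The sample objects below are made up, and their shape (a media_key and a type field) is an assumption about what expandedMediaIncludes holds.

```javascript
// Made-up sample data; the real shape of expandedMediaIncludes is assumed.
const expandedMediaIncludes = [
  { media_key: "3_111", type: "photo" },
  { media_key: "7_222", type: "video" },
];
const mediaKeys = ["7_222", "3_999"];

for (const x of mediaKeys) {
  // ({ media_key }) pulls the media_key property out of each element.
  const mediaInfo = expandedMediaIncludes.find(({ media_key }) => media_key === x);
  console.log(x, mediaInfo ? mediaInfo.type : "not found");
}

// With { media_keys } the elements have no such property, so the value is
// undefined for every element and find never matches.
const wrong = expandedMediaIncludes.find(({ media_keys }) => media_keys === "7_222");
console.log(wrong); // undefined
```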

PuruVJ commented 3 years ago

I think we can fix the "Show replies" one by clicking in a while loop, but we'd also need to set a limit of 2 to 3 clicks, as we simply can't put in more than 100 IDs for checking, even when all those IDs may not belong to that thread itself.

You know it better @Mira-Alf. What do you think?