louisgrasset / touitomamout

Touitomamout is an easy way to synchronize your Twitter's tweets 🦤 to Mastodon 🦣 and Bluesky post ☁️ (also known as Twitter to Mastodon & Bluesky crossposter)
https://hub.docker.com/r/louisgrasset/touitomamout
GNU Affero General Public License v3.0

XRPCError: This file is too large hanging up script #142

Closed JoshuaHolme closed 3 months ago

JoshuaHolme commented 9 months ago

Describe the bug Script is stuck at mirroring a post that has an image that is too large.

XRPCError: This file is too large. It is 1.32MB but the maximum size is 976.56KB. The script then stops at that tweet: it won't go past it, and it won't mirror it without the attachment.

This is the tweet in question https://x.com/ElectrekCo/status/1722329025882792287?s=20

Trying to mirror it to @electrek-mirror.bsky.social

To Reproduce Steps to reproduce the behavior:

  1. Try to mirror @ElectrekCo on twitter
  2. Have it try to mirror this tweet from November 8th

Expected behavior I expect it to either skip this tweet or post it without the image
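
A minimal sketch of the "post it without the image" behavior, assuming attachments are already downloaded as byte arrays. The names `MediaItem`, `splitBySize` and `MAX_BLOB_BYTES` are hypothetical and are not part of touitomamout. The 1,000,000-byte limit is also an assumption, inferred from the 976.56KB figure in the error message (976.56 × 1024 ≈ 1,000,000).

```typescript
// Assumed upload limit in bytes, inferred from the "976.56KB" in the error above.
const MAX_BLOB_BYTES = 1_000_000;

// Hypothetical shape for a downloaded attachment.
interface MediaItem {
  url: string;       // where the attachment came from
  bytes: Uint8Array; // downloaded content
}

interface SizeCheckResult {
  accepted: MediaItem[]; // small enough to upload
  rejected: MediaItem[]; // too large, to be dropped from the post
}

// Split attachments into those that fit under the limit and those that do not,
// so the caller can still send the post text with whatever media remains.
function splitBySize(
  items: MediaItem[],
  maxBytes: number = MAX_BLOB_BYTES,
): SizeCheckResult {
  const accepted: MediaItem[] = [];
  const rejected: MediaItem[] = [];
  for (const item of items) {
    if (item.bytes.byteLength <= maxBytes) {
      accepted.push(item);
    } else {
      rejected.push(item);
    }
  }
  return { accepted, rejected };
}
```

With a check like this in front of the upload, an oversized image would cost the post its attachment rather than block the whole queue.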

JoshuaHolme commented 9 months ago

Also getting a TypeError: Failed to parse URL from error on https://twitter.com/ElectrekCo/status/1724805155789721880

JoshuaHolme commented 9 months ago

Same "TypeError: Failed to parse URL from" error on this tweet as well, which seems to be a video and should have been skipped. https://twitter.com/ElectrekCo/status/1723083248241090761

JoshuaHolme commented 9 months ago

Another "Failed to parse URL" error. https://twitter.com/ElectrekCo/status/1725522246700052799 Unlike the others, I'm not able to bypass this one for some reason. I put it in my cache file hoping to skip over it, but it keeps trying to grab it.

Edit, never mind. The actual link was this one. Another video file. https://twitter.com/ElectrekCo/status/1725529239712391192

louisgrasset commented 9 months ago

Thanks a lot for all this useful info, I'll take a look. Currently, nothing is implemented for broadcasts (quite rare content, but I'll try to do something).

Regarding media size, I started to implement a compression service.
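
One way such a compression step could work is to re-encode at progressively lower quality until the result fits. This is only a sketch under stated assumptions, not the service implemented in the project: `compressToFit`, the `Encoder` type, the quality ladder and the 1,000,000-byte default are all hypothetical. The encoder is injected and synchronous here to keep the example self-contained. A real image encoder would likely be asynchronous.

```typescript
// Hypothetical encoder signature: re-encode `input` at a given quality (1-100).
type Encoder = (input: Uint8Array, quality: number) => Uint8Array;

// Try each quality in order and return the first output that fits under
// `maxBytes`. Returns the input unchanged if it already fits, or null if
// no quality level produces a small enough result.
function compressToFit(
  input: Uint8Array,
  encode: Encoder,
  maxBytes: number = 1_000_000, // assumed limit, see the error message in this issue
  qualities: number[] = [90, 75, 60, 45, 30],
): Uint8Array | null {
  if (input.byteLength <= maxBytes) {
    return input;
  }
  for (const quality of qualities) {
    const output = encode(input, quality);
    if (output.byteLength <= maxBytes) {
      return output;
    }
  }
  return null; // caller decides: skip the media or skip the post
}
```

Returning `null` instead of throwing leaves the skip-or-fail decision to the caller.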

louisgrasset commented 9 months ago

Compression should be ok now: https://github.com/louisgrasset/touitomamout/pull/150 I'll let you test it a bit more.

I'll investigate the issue created by tweets having broadcasts

JoshuaHolme commented 9 months ago

I'll pull down the latest version. Unfortunately this account does a broadcast every Friday 🙄

louisgrasset commented 9 months ago

Some updates regarding broadcast:

 blueskyPost: {
    chunks: [
      'Podcast: Tesla Cybertruck spec leak, Volvo EX30 first drive, EV earnings, and more https://twitter.com/i/broadcasts/1vOxwjwWVomJB'
    ],
    username: 'touitomamout.bsky.social',
    replyPost: undefined,
    quotePost: undefined,
    tweet: {
      conversationId: '1723083248241090761',
      id: '1723083248241090761',
      hashtags: [],
      likes: 8,
      mentions: [],
      name: 'Electrek.co',
      permanentUrl: 'https://twitter.com/ElectrekCo/status/1723083248241090761',
      photos: [],
      replies: 1,
      retweets: 1,
      text: 'Podcast: Tesla Cybertruck spec leak, Volvo EX30 first drive, EV earnings, and more https://twitter.com/i/broadcasts/1vOxwjwWVomJB',
      thread: [],
      urls: [Array],
      userId: '2148233600',
      username: 'ElectrekCo',
      videos: [],
      isQuoted: false,
      isReply: false,
      isRetweet: false,
      isPin: false,
      sensitiveContent: false,
      timeParsed: 2023-11-10T21:00:34.000Z,
      timestamp: 1699650034000,
      html: 'Podcast: Tesla Cybertruck spec leak, Volvo EX30 first drive, EV earnings, and more <a href="https://twitter.com/i/broadcasts/1vOxwjwWVomJB">https://t.co/qLjaUM9QkH</a>',
      views: 5870
    }
  }

JoshuaHolme commented 9 months ago

Got another failure. Really weird one because it's a pretty plain tweet. https://twitter.com/engadget/status/1730278902013022687

JoshuaHolme commented 9 months ago

Another one. Same thing, pretty plain tweet it seems. https://twitter.com/engadget/status/1730301878263574598

JoshuaHolme commented 9 months ago

Is it helpful to keep posting tweets that are failing? I'm not sure what other info would be helpful

JoshuaHolme commented 9 months ago

Another failure. This time it says media type not supported (image/webpage) but doesn't continue with the rest of the tweets and doesn't skip it

https://twitter.com/teslascope/status/1730808586996175180

JoshuaHolme commented 8 months ago

@louisgrasset Are you still working on this project? Or has it fallen by the wayside due to different priorities?

louisgrasset commented 8 months ago

Nope, this issue is still getting investigated on my side.

Would it be possible to include logs in the next comments?

I see two potential issues:

To be honest I feel this issue is kind of blurry, but I don't despair of finding a solution

JoshuaHolme commented 8 months ago

Where do you want me to pull the logs from? The terminal output? Or some other file?

JoshuaHolme commented 8 months ago

Another error I'm getting is "Media type not supported (application/json;charset=utf-8)"

JoshuaHolme commented 8 months ago

Also "Media type not supported (image/webp)"

JoshuaHolme commented 8 months ago

@louisgrasset Is there any other info I can provide to help? You mentioned logs before. How best should I get those for you? Grab them from the terminal output? Or somewhere else?

louisgrasset commented 7 months ago

Hey Joshua, is there a way to get the full log when the issue occurs?

JoshuaHolme commented 7 months ago

Yeah I can do that. Where is the full log? Just the terminal output? Or is it dumped somewhere else?

louisgrasset commented 7 months ago

The terminal output

JoshuaHolme commented 7 months ago

@louisgrasset here's an example output from the broadcast failure. Here is a link to the tweet that it failed on: https://twitter.com/ElectrekCo/status/1753523748727316988

josh@raspberrypi:~/Desktop/Electrek/touitomamout $ node ./dist/index.js .env
(node:8140) ExperimentalWarning: Import assertions are not a stable feature of the JavaScript language. Avoid relying on their current behavior and syntax as those might change in a future version of Node.js.
(Use `node --trace-warnings ...` to show where the warning was created)
(node:8140) ExperimentalWarning: Importing JSON modules is an experimental feature and might change at any time

Touitomamout@v1.5.0

⚙️ cache        ✔ task finished
🦤 client       ✔ connected (session restored)
☁️ client       ✔ connected
profile-sync    ✔ task finished
content-mapper  ✔ tweets: total: 13 retweets: 0 replies: 0 quotes: 0
content-mapper  ✔ task finished
content-sync    ℹ post: → generated « Podcast: Elon’s Tesla CEO... » → ☁️ 1 chunk
Error: Unable to download media:
TypeError: Failed to parse URL from 
    at mediaDownloaderService (file:///home/josh/Desktop/Electrek/touitomamout/dist/services/media-downloader.service.js:9:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async getBlueskyLinkMetadata (file:///home/josh/Desktop/Electrek/touitomamout/dist/helpers/bluesky/get-bluesky-link-metadata.js:16:23)
    at async getBlueskyChunkLinkMetadata (file:///home/josh/Desktop/Electrek/touitomamout/dist/helpers/bluesky/get-bluesky-chunk-link-metadata.js:15:24)
    at async blueskySenderService (file:///home/josh/Desktop/Electrek/touitomamout/dist/services/bluesky-sender.service.js:106:22)
    at async postsSynchronizerService (file:///home/josh/Desktop/Electrek/touitomamout/dist/services/posts-synchronizer.service.js:29:17)
    at async touitomamout (file:///home/josh/Desktop/Electrek/touitomamout/dist/index.js:21:31)
    at async file:///home/josh/Desktop/Electrek/touitomamout/dist/index.js:33:1

🦤 → 🦣+☁️
Touitomamout sync | v1.5.0
| Twitter handle: @electrekco
| 00000  ʲᵘˢᵗ ˢʸⁿᶜᵉᵈ ᵖᵒˢᵗˢ
| 01217  ˢʸⁿᶜᵉᵈ ᵖᵒˢᵗˢ ˢᵒ ᶠᵃʳ
josh@raspberrypi:~/Desktop/Electrek/touitomamout $ 

louisgrasset commented 7 months ago

From what I see, it's potentially already fixed. Try to use the latest version (1.5 is already old) and see how it goes.

If not, send me another log 🌝
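
For context, the trace above ends with "Failed to parse URL from" followed by nothing, which suggests the downloader was handed an empty URL. A guard like the following would turn that case into a skippable `null` instead of an exception. This is a sketch only: `parseMediaUrl` is a hypothetical helper, not a function from the project, and it is not a description of the actual fix.

```typescript
// Return a parsed URL, or null when the input is missing, empty, or malformed.
function parseMediaUrl(raw: string | null | undefined): URL | null {
  if (!raw || raw.trim() === "") {
    return null; // nothing to download, e.g. a link card without an image
  }
  try {
    return new URL(raw);
  } catch {
    return null; // new URL() throws a TypeError on invalid input
  }
}
```

A caller would check for `null` before downloading and simply send the post without link-card media in that case.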

JoshuaHolme commented 7 months ago

Weird I thought I was updated. Is there anything I need to do aside from git pull to update?

JoshuaHolme commented 7 months ago

I reran the build command and it updated! It fixed the broadcast issue which is awesome, there was one other account that had reproducible issues that I'll try again after work and post the log here if it doesn't work.

JoshuaHolme commented 7 months ago

Here is an error I just received on another one of my accounts. I think rerunning usually gets it to push through though

UPDATE Rerunning this a second time succeeded, but attached anyway in case it was noteworthy

josh@raspberrypi:~/Desktop/RapSheet/touitomamout $ node ./dist/index.js .env
(node:1889) ExperimentalWarning: Import assertions are not a stable feature of the JavaScript language. Avoid relying on their current behavior and syntax as those might change in a future version of Node.js.
(Use `node --trace-warnings ...` to show where the warning was created)
(node:1889) ExperimentalWarning: Importing JSON modules is an experimental feature and might change at any time

Touitomamout@v1.6.6

⚙️ cache        ✔ task finished
🦤 client       ✔ connected (session restored)
☁️ client       ✔ connected
profile-sync    ✔ task finished
content-mapper  ✔ tweets: total: 98 retweets: 0 replies: 0 quotes: 0
content-mapper  ✔ task finished
content-sync    ℹ post: → generated « “Jim Harbaugh is football... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « “Jim Harbaugh is football... »
content-sync    ℹ post: → generated « A few key second, in-pers... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « A few key second, in-pers... »
content-sync    ℹ post: → generated « A surprise! The #Panthers... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « A surprise! The #Panthers... »
content-sync    ℹ post: → generated « From The Insiders on #NFL... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From The Insiders on #NFL... »
content-sync    ℹ post: → generated « From The Insiders on #NFL... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From The Insiders on #NFL... »
content-sync    ℹ post: → generated « From The Insiders on #NFL... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From The Insiders on #NFL... »
content-sync    ℹ post: → generated « From The Insiders on #NFL... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From The Insiders on #NFL... »
content-sync    ℹ post: → generated « The #Commanders are expec... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Commanders are expec... »
content-sync    ℹ post: → generated « The #Falcons completed th... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Falcons completed th... »
content-sync    ℹ post: → generated « The #Falcons were the onl... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Falcons were the onl... »
content-sync    ℹ post: → generated « From @NFLTotalAccess: Rah... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From @NFLTotalAccess: Rah... »
content-sync    ℹ post: → generated « Is Bill Belichick set to ... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « Is Bill Belichick set to ... »
content-sync    ℹ post: → generated « From @nflnetwork: New #Fa... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From @nflnetwork: New #Fa... »
content-sync    ℹ post: → generated « Former #Bears OC Luke Get... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « Former #Bears OC Luke Get... »
content-sync    ℹ post: → generated « From @GMFB: Watching key ... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From @GMFB: Watching key ... »
content-sync    ℹ post: → generated « From @GMFB: The #Panthers... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From @GMFB: The #Panthers... »
content-sync    ℹ post: → generated « The #Eagles are interview... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Eagles are interview... »
content-sync    ℹ post: → generated « Now official: The #Ravens... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « Now official: The #Ravens... »
content-sync    ℹ post: → generated « #Rams QBs coach and pass ... » → ☁️ 2 chunks
content-sync    ✔ ☁️ | post sent: « #Rams QBs coach and pass ... » (2 chunks)
content-sync    ℹ post: → generated « From The Insiders on #NFL... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From The Insiders on #NFL... »
content-sync    ℹ post: → generated « From The Insiders on #NFL... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From The Insiders on #NFL... »
content-sync    ℹ post: → generated « From The Insiders on #NFL... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From The Insiders on #NFL... »
content-sync    ℹ post: → generated « The #Packers have intervi... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Packers have intervi... »
content-sync    ℹ post: → generated « The #Patriots plan to int... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Patriots plan to int... »
content-sync    ℹ post: → generated « The #Chiefs have ruled ou... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Chiefs have ruled ou... »
content-sync    ℹ post: → generated « The #Panthers have reques... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Panthers have reques... »
content-sync    ℹ post: → generated « As often happens when new... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « As often happens when new... »
content-sync    ℹ post: → generated « The #Falcons have request... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Falcons have request... »
content-sync    ℹ post: → generated « #Texans QBs coach Jerrod ... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « #Texans QBs coach Jerrod ... »
content-sync    ℹ post: → generated « The #Chiefs have downgrad... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Chiefs have downgrad... »
content-sync    ℹ post: → generated « Former #Eagles DC Sean De... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « Former #Eagles DC Sean De... »
content-sync    ℹ post: → generated « The #Lions elevated FB Ja... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Lions elevated FB Ja... »
content-sync    ℹ post: → generated « Sources: The #Falcons are... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « Sources: The #Falcons are... »
content-sync    ℹ post: → generated « The #Eagles are expected ... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « The #Eagles are expected ... »
content-sync    ℹ post: → generated « The #Seahawks plan to int... » → ☁️ 2 chunks
content-sync    ✔ ☁️ | post sent: « The #Seahawks plan to int... » (2 chunks)
content-sync    ℹ post: → generated « #Chiefs All-Pro guard Joe... » → ☁️ 1 chunk
content-sync    ✔ ☁️ | post sent: « #Chiefs All-Pro guard Joe... »
content-sync    ℹ post: → generated « From @NFLGameDay: The #Li... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From @NFLGameDay: The #Li... »
content-sync    ℹ post: → generated « The Insiders on @NFLGameD... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « The Insiders on @NFLGameD... »
content-sync    ℹ post: → generated « From @NFLGameDay: #49ers ... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From @NFLGameDay: #49ers ... »
content-sync    ℹ post: → generated « From @NFLGameDay: Will Ta... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
content-sync    ✔ ☁️ | post sent: « From @NFLGameDay: Will Ta... »
content-sync    ℹ post: → generated « The Insiders on @NFLGameD... » → ☁️ 1 chunk
content-sync    ⚠ medias: ↯ (1/1) skipped for ☁️ bluesky : Media type not supported (video/mp4)
XRPCError: TypeError: fetch failed
    at _AtpAgent.defaultFetchHandler [as fetch] (/home/josh/Desktop/RapSheet/touitomamout/node_modules/@atproto/api/dist/index.js:15673:11)
    at async BskyAgent._fetch (/home/josh/Desktop/RapSheet/touitomamout/node_modules/@atproto/api/dist/index.js:27981:15)
    at async ServiceClient.call (/home/josh/Desktop/RapSheet/touitomamout/node_modules/@atproto/api/dist/index.js:15634:17)
    at async PostRecord.create (/home/josh/Desktop/RapSheet/touitomamout/node_modules/@atproto/api/dist/index.js:27526:17)
    at async blueskySenderService (file:///home/josh/Desktop/RapSheet/touitomamout/dist/services/bluesky-sender.service.js:170:9)
    at async postsSynchronizerService (file:///home/josh/Desktop/RapSheet/touitomamout/dist/services/posts-synchronizer.service.js:29:17)
    at async touitomamout (file:///home/josh/Desktop/RapSheet/touitomamout/dist/index.js:21:31)
    at async file:///home/josh/Desktop/RapSheet/touitomamout/dist/index.js:33:1 {
  status: 1,
  error: 'TypeError: fetch failed',
  success: false,
  headers: undefined
}

🦤 → 🦣+☁️
Touitomamout sync | v1.6.6
| Twitter handle: @rapsheet
| 00000  ʲᵘˢᵗ ˢʸⁿᶜᵉᵈ ᵖᵒˢᵗˢ
| 00019  ˢʸⁿᶜᵉᵈ ᵖᵒˢᵗˢ ˢᵒ ᶠᵃʳ
josh@raspberrypi:~/Desktop/RapSheet/touitomamout $ 

louisgrasset commented 7 months ago

Yes, after pulling the project (or its update), you have to run install & build commands

See docs here: https://louisgrasset.github.io/touitomamout/docs/configuration/manual-sync#installation

JoshuaHolme commented 7 months ago

Yep, I rebuilt and ran it again. The second log I sent worked fine after I ran it again following the error. Below is an example of a failure caused by an image/webp. It comes from automated tweets, an example of which can be found here. https://twitter.com/teslascope/status/1752424757717504468

It seems like it's only a link that's attached, no image other than the link card, which I guess could be causing the issue. The script also stops after this failure, and doesn't continue on. It's not like other failures where it skips that tweet and continues to the next

josh@raspberrypi:~/Desktop/Teslascope/touitomamout $ node ./dist/index.js .env
(node:4490) ExperimentalWarning: Import assertions are not a stable feature of the JavaScript language. Avoid relying on their current behavior and syntax as those might change in a future version of Node.js.
(Use `node --trace-warnings ...` to show where the warning was created)
(node:4490) ExperimentalWarning: Importing JSON modules is an experimental feature and might change at any time

Touitomamout@v1.6.6

⚙️ cache        ✔ task finished
🦤 client       ✔ connected (session restored)
☁️ client       ✔ connected
profile-sync    ✔ task finished
content-mapper  ✔ tweets: total: 10 retweets: 0 replies: 0 quotes: 1
content-mapper  ✔ task finished
content-sync    ℹ post: → generated « We noticed a new Tesla so... » → ☁️ 1 chunk
Media type not supported (image/webp)

🦤 → 🦣+☁️
Touitomamout sync | v1.6.6
| Twitter handle: @teslascope
| 00000  ʲᵘˢᵗ ˢʸⁿᶜᵉᵈ ᵖᵒˢᵗˢ
| 00005  ˢʸⁿᶜᵉᵈ ᵖᵒˢᵗˢ ˢᵒ ᶠᵃʳ

louisgrasset commented 7 months ago

From what I saw in the codebase, it was missing an error handler, leading the whole sync to blow up.

Since, for now, I don't see any reason why it was failing for this specific tweet (I synced successfully on a test account), I'll let you try to use the latest code version (either by pulling the repo + install + build OR by using the latest dev docker image).

Please note the version will not be bumped since no new release will be done for this specific change.

Releases: https://github.com/louisgrasset/touitomamout/releases

Let me know how it goes with the fix. If you still get issues, please, send (again lmao) the logs, we'll try to understand the issue.

Also, to answer your previous message, it is expected that the post is not synced to bluesky if the media is not supported. To get more information, you can use the DEBUG env variable to get more logs.

JoshuaHolme commented 7 months ago

Sorry, I've been combining issues into this one and now I've confused myself. There were two issues I talked about

  1. @RapSheet, where the sync hit XRPCError: TypeError: fetch failed but ran without issue when I reran it
  2. @Teslascope which would hit Media type not supported (image/webp)

In regards to #2, I get that the unsupported media isn't expected to sync to bluesky, but the sync stops after that. It doesn't skip the post and continue with the other posts (which this account has). It kind of hits it and gets dead in the water. What I would expect is that when it hits the posts with image/webp it skips it like it does video and other unsupported types, and then moves to the next tweet in the queue
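
The behavior described here (skip the failing post, keep going) could look roughly like the loop below, where each post gets its own error handler. This is a sketch, not the project's synchronizer: `syncAll`, `SyncReport` and the `send` callback are hypothetical names introduced for illustration.

```typescript
// Hypothetical summary of one sync run.
interface SyncReport {
  sent: string[];                            // ids of posts that went through
  skipped: { id: string; reason: string }[]; // ids that failed, with the error message
}

// Send posts one by one. A failure on one post is recorded and the loop
// moves on to the next post instead of aborting the whole run.
async function syncAll<T extends { id: string }>(
  posts: T[],
  send: (post: T) => Promise<void>,
): Promise<SyncReport> {
  const report: SyncReport = { sent: [], skipped: [] };
  for (const post of posts) {
    try {
      await send(post);
      report.sent.push(post.id);
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      report.skipped.push({ id: post.id, reason });
    }
  }
  return report;
}
```

The `skipped` list keeps the failure visible in the output, so an unsupported type such as image/webp is still reported without stopping the remaining tweets in the queue.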

JoshuaHolme commented 5 months ago

@louisgrasset any luck looking into this?

louisgrasset commented 3 months ago

Hello, regarding the two issues we were able to detect:

  1. The image compression is now effective
  2. The image/webp mime type is not allowed

I'll close the issue, let me know if you still face issues.