sugyan / atrium

Rust libraries for Bluesky's AT Protocol services.
MIT License

How do I convert from Cid to rkey when deleting posts? #252

Closed klausi closed 5 days ago

klausi commented 6 days ago

Hi,

I'm writing a tool to delete old Bluesky posts, code is at https://github.com/klausi/mastodon-bluesky-sync/blob/bluesky_delete_posts/src/delete_posts.rs (currently does not compile).

I fetch the user's posts and save their Cids so I can delete them later:

async fn bluesky_fetch_post_dates(
    bsky_agent: &BskyAgent,
    cache_file: &str,
) -> Result<BTreeMap<DateTime<Utc>, Cid>> {
    let mut dates = BTreeMap::new();
    // Pagination cursor; `None` requests the first page.
    let mut cursor: Option<String> = None;
    loop {
        // Try to fetch as many posts as possible at once, Bluesky API docs say
        // that is 100.
        let feed = match bsky_agent
            .api
            .app
            .bsky
            .feed
            .get_author_feed(
                bsky_sdk::api::app::bsky::feed::get_author_feed::ParametersData {
                    actor: bsky_agent.get_session().await.unwrap().did.clone().into(),
                    cursor: cursor.clone(),
                    filter: None,
                    include_pins: None,
                    limit: Some(LimitedNonZeroU8::try_from(100).unwrap()),
                }
                .into(),
            )
            .await
        {
            Ok(posts) => posts,
            Err(e) => {
                eprintln!("Error fetching posts from Bluesky: {e:#?}");
                break;
            }
        };

        for post in &feed.feed {
            let record = bsky_sdk::api::app::bsky::feed::post::RecordData::try_from_unknown(
                post.post.record.clone(),
            )
            .expect("Failed to parse Bluesky post record");
            dates.insert(
                record.created_at.as_ref().clone().into(),
                post.post.cid.clone(),
            );
        }
        // Continue from where this page ended; stop when there are no more pages.
        cursor = feed.cursor.clone();
        if cursor.is_none() {
            break;
        }
    }

    save_dates_to_cache(cache_file, &dates).await?;

    Ok(dates)
}

Then I try to delete them later:

// Delete old posts of this account that are older than 90 days.
pub async fn bluesky_delete_older_posts(bsky_agent: &BskyAgent, dry_run: bool) -> Result<()> {
    // In order not to fetch old posts every time keep them in a cache file
    // keyed by their dates.
    let cache_file = &cache_file("bluesky_cache.json");
    let dates = bluesky_load_post_dates(bsky_agent, cache_file).await?;
    let mut remove_dates = Vec::new();
    let three_months_ago = Utc::now() - Duration::days(90);
    for (date, post_id) in dates.range(..three_months_ago) {
        println!("Deleting Bluesky post from {date}: {:#?}", post_id);
        // Do nothing on a dry run, just print what would be done.
        if dry_run {
            continue;
        }

        let delete_result = bsky_agent
            .api
            .com
            .atproto
            .repo
            .delete_record(
                InputData {
                    repo: bsky_sdk::api::types::string::AtIdentifier::Did(
                        bsky_agent.get_session().await.unwrap().did,
                    ),
                    // @todo How can we set this as constant here to avoid parsing
                    // in each iteration?
                    collection: "app.bsky.feed.post"
                        .parse()
                        .expect("Failed to parse Bluesky collection name"),
                    rkey: post_id.into(),
                    swap_commit: None,
                    swap_record: None,
                }
                .into(),
            )
            .await;
        // @todo The status could have been deleted already by the user, ignore API
        // errors in that case.
        if let Err(e) = delete_result {
            eprintln!("Failed to delete post {:#?}: {e}", post_id);
        }
    }
    remove_dates_from_cache(remove_dates, &dates, cache_file).await
}

Compilation error:

error[E0277]: the trait bound `std::string::String: From<&Cid>` is not satisfied
  --> src/delete_posts.rs:45:35
   |
45 |                     rkey: post_id.into(),
   |                                   ^^^^ the trait `From<&Cid>` is not implemented for `std::string::String`, which is required by `&Cid: Into<_>`
   |

How can I convert a Cid into a String that is acceptable as rkey when deleting posts?

Why is a string representation missing for Cid? Would be super useful when printing out messages to not rely on the debug representation.

Thanks!

sugyan commented 5 days ago

A Cid is only a hash of a record's content and is not related to the rkey.

If you want to delete a record using delete_record, I think you should store the rkey obtained by parsing the post's uri, not its cid. For example, this is how bsky-sdk gets the rkey from the at-uri and deletes the record: https://github.com/sugyan/atrium/blob/main/bsky-sdk/src/record/agent.rs#L82
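To illustrate the idea of getting the rkey from the at-uri, here is a minimal std-only sketch. It assumes a record URI has the shape `at://<repo>/<collection>/<rkey>`. The `parse_at_uri` helper and `AtUriParts` struct are hypothetical names made up for this example, not part of atrium or bsky-sdk, and the DID and rkey in `main` are placeholder values.

```rust
/// The three parts of a record AT URI: `at://<repo>/<collection>/<rkey>`.
/// Hypothetical type for illustration; not an atrium API.
#[derive(Debug, PartialEq)]
struct AtUriParts<'a> {
    repo: &'a str,
    collection: &'a str,
    rkey: &'a str,
}

/// Hypothetical helper: split a record AT URI into repo, collection and rkey.
/// Returns `None` if the string does not have exactly those three segments.
fn parse_at_uri(uri: &str) -> Option<AtUriParts<'_>> {
    let rest = uri.strip_prefix("at://")?;
    // At most three pieces; any extra '/' ends up in the last piece.
    let mut parts = rest.splitn(3, '/');
    let repo = parts.next()?;
    let collection = parts.next()?;
    let rkey = parts.next()?;
    if repo.is_empty() || collection.is_empty() || rkey.is_empty() || rkey.contains('/') {
        return None;
    }
    Some(AtUriParts { repo, collection, rkey })
}

fn main() {
    // Placeholder URI; a real one would come from the `uri` field of a post.
    let uri = "at://did:plc:example/app.bsky.feed.post/3kexamplerkey";
    match parse_at_uri(uri) {
        Some(parts) => println!(
            "repo={} collection={} rkey={}",
            parts.repo, parts.collection, parts.rkey
        ),
        None => eprintln!("not a record URI: {uri}"),
    }
}
```

The extracted `rkey` and `collection` strings are the values that would go into the corresponding `delete_record` input fields, after whatever parsing those field types require.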

By the way, atrium_api::types::string::Cid is a wrapper for ipld_core::cid::Cid and you can get ipld_core::cid::Cid with as_ref(). This one has a string representation, so I guess we can use that.

klausi commented 5 days ago

Thanks a lot, did not realize that the URI is the ID of the post.

agent.delete_record() is exactly what I need when I store URIs, so I don't have to deal with Cid at all.