kensanata / mastodon-archive

Archive your statuses, favorites and media using the Mastodon API (i.e. login required)
https://alexschroeder.ch/software/Mastodon_Archive
GNU General Public License v3.0

Coloured text search? #56

Open ashwinvis opened 4 years ago

ashwinvis commented 4 years ago

The only thing I can do now is pipe the output into | pygmentize -s -l md, but it is not colourful enough :)

kensanata commented 4 years ago

Do you know a Python library that does the right thing depending on whether stdout is a pipe or a terminal? And what exactly would we highlight and how?

ashwinvis commented 4 years ago

Pygments is the best package that I know of to colourize based on syntax. If you want to colourize manually, there is colorama.

As I see it there are two alternative ways to go about it:

  1. properly format markdown and syntax highlight it, or...
  2. convert markdown into plain text in a format which mimics the Mastodon timeline (similar to TUI applications like toot) and display it on the terminal.
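For the manual route, and for the earlier question about behaving differently on a pipe versus a terminal, here is a minimal sketch using only the standard library. It assumes the thing to highlight is the search pattern itself, treated as a case-insensitive regular expression. The names `highlight` and `should_colour` are hypothetical and not part of mastodon-archive.

```python
import re
import sys

# Standard ANSI SGR escape codes: bold red on, and reset.
# These are terminal conventions, not something mastodon-archive defines.
BOLD_RED = "\033[1;31m"
RESET = "\033[0m"


def should_colour(stream):
    """Return True only when the stream is an interactive terminal."""
    return hasattr(stream, "isatty") and stream.isatty()


def highlight(text, pattern, use_colour):
    """Wrap every case-insensitive match of `pattern` in colour codes.

    `pattern` is assumed to be a regular expression (hypothetical choice).
    When `use_colour` is False the text is returned unchanged, so piped
    output stays free of escape codes.
    """
    if not use_colour:
        return text
    regex = re.compile(pattern, re.IGNORECASE)
    return regex.sub(lambda m: BOLD_RED + m.group(0) + RESET, text)


# Example: coloured on a terminal, plain when piped into another program.
line = "Working on Oddmuse again, oddmuse is fun."
print(highlight(line, "oddmuse", should_colour(sys.stdout)))
```

Checking `isatty()` on the output stream is the usual way to keep escape codes out of pipes and files. A real implementation would probably also want a flag to force colour on or off.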

ashwinvis commented 4 years ago

A completely different route would be to add a search box to the HTML export, like Sphinx does, or, even better, to make a markdown export compatible with Sphinx.

kensanata commented 4 years ago

I have no clue how all of that would work (and across multiple local HTML files). The part I understand is piping the result of a search through mdcat or the like. The problem as far as I can tell is simply the ANSI escape codes as soon as you're not in a pager: mastodon-archive text kensanata@octodon.social oddmuse | mdcat | head | less
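One way around stray escape codes is a small filter that removes them whenever its own output is not a terminal. The following is only a sketch under that assumption. `strip_ansi` and `filter_stream` are hypothetical names, and the regular expression covers ordinary CSI sequences (such as colour codes) rather than every possible terminal control sequence.

```python
import re

# Matches CSI sequences such as "\x1b[1;31m" (colour on) or "\x1b[0m" (reset).
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Remove CSI escape sequences from a string."""
    return ANSI_CSI.sub("", text)


def filter_stream(src, dst):
    """Copy lines from src to dst, stripping escapes unless dst is a terminal."""
    keep = hasattr(dst, "isatty") and dst.isatty()
    for line in src:
        dst.write(line if keep else strip_ansi(line))
```

Saved as a script that calls `filter_stream(sys.stdin, sys.stdout)`, this could sit between the colourizing tool and whatever reads its output. If the goal is to keep the colours inside less, `less -R` passes colour escape sequences through to the terminal.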

image

kensanata commented 4 years ago

Another thing I like: mastodon-archive text kensanata@octodon.social oddmuse | kramdown | w3m -T text/html … I guess what I'm saying is that I'm not going to spend a lot of time on this. If you have a patch ready to go?

ashwinvis commented 4 years ago

If piping works, then we don't need to implement this in the text search command. I guess the first step before doing this would be to properly format the markdown output, which currently lacks structure.

cutiful commented 2 years ago

@ashwinvis

> A completely different route would be to add a search box in the html export

You could try using the new Meow integration! Here's the readme section that explains how to use it. It has full-text search and displays posts as HTML, so links are links, etc. Looks like this:

a screenshot showing the interface

ashwinvis commented 2 years ago

I am a bit hesitant to use it. A locally hosted open-source solution would be nice to have. Depending on the interface either

would be a good fit.

cutiful commented 2 years ago

@ashwinvis that's fair. Would you mind answering a few questions? If you don't want to respond, just ignore this message.

Is your preference for a locally hosted application a privacy concern? If so, I'm thinking of adding an option to verifiably block any network requests for Meow. It could be done in a variety of ways: for example, with a third-party open-source browser extension (uBlock or any other ad blocker could be configured to do it), with a simple bookmarklet whose code you could verify, or with an offline mode which would allow you to just unplug the internet cable. This way, you could open Meow, turn off the internet (only for Meow or completely), import your archive and use it. Afterwards, you would erase the data in your browser settings and turn the internet back on. In this setup, there would provably be no way for Meow to leak information.

If that was available (and easy enough to set up), would you consider using it? If not, why?

(Hope kensanata doesn't mind this semi-offtopic discussion...)

ashwinvis commented 2 years ago

> Is your preference for a locally hosted application a privacy concern?

Yes. That's correct.

> It could be done in a variety of ways, for example, using a third-party open-source browser extension (uBlock or any other ad blocker could be configured to do it). Or a simple bookmarklet you could verify the code for, or an offline mode which would allow you to just unplug the internet cable. This way, you could open Meow, turn off the internet (only for Meow or completely), import your archive and use it. Then afterwards erase the data in your browser settings and turn the internet back on. In this setup, there would provably be no way for Meow to leak information.

An offline mode seems promising! I doubt that uBlock Origin can block network requests, but it might be possible using HTTP CORS and CSP. Of course, it should be possible to verify it too. Another (platform-dependent) alternative would be to set up firewalls.

> If that was available (and easy enough to set up), would you consider using it? If not, why?

Yes, then I would use it.

cutiful commented 2 years ago

Thanks for the feedback! I looked into it and figured that a full-on offline mode adds a lot of complexity for little gain. But ad blockers definitely can block network requests, and a solution based on that seems pretty good to me: it's accessible (a lot of people have Adblock Plus or uBlock Origin installed already), easy to verify and persistent (you don't need to remember to remove the data, because as long as you have the blocklist added, requests are blocked).

For this to work, the rules have to allow requests to the website itself (to load HTML/JS/CSS). But it's hosted on Neocities, a static hosting service which doesn't allow server-side code, and server-side code would be necessary to transfer any data to it. I think it should be fine?

I made this blocklist: View / Add to ABP or uBO. It consists of two rules, everything is documented in the file itself. Please let me know if this addresses your concerns. If yes, I'm going to publish it for all users after some more extensive testing.

IzzySoft commented 1 year ago

I usually pipe the output to lynx -stdin (after converting it from markdown to HTML using the markdown command). It's also not exactly as colourful as you'd like, but it allows not only paging but also following links directly. By the way, it's integrated into the contrib/mastosearch script, where you can set up your preferences via a config (it was just added in v1.4.3 today).