Open romu70 opened 5 years ago
As we discussed on Gitter.im: the Researcher is probably more of an Aggregator who uses wallabag to collect several pieces of content before writing something from them (e.g. a student, a blog writer, maybe a teacher). S/he needs to organize (tags), search and annotate content.
For a Researcher (in the academic meaning), wallabag would need great improvements in metadata for reference management and content extraction from PDFs.
Let me bring up another archetype, which describes a part of my workflow.
Usage: for everything
Instance: own server
Also uses: archive.org, archive.is, Zotero
Behavior:
"Wallabag is the place I store all the important links I find on the internet."
NEEDS:
The archivist needs to keep his data forever, searchable and organized. This ensures that important data is not lost when the original site is destroyed because of neglect, censorship or catastrophes.
What is important is that the archived sites are faithfully reproduced and accessible reliably in the future. Making it easier to find specific topics and URLs is critical, and having the archive searchable is a plus.
The archivist, like the researcher, is an expert at data selection and management. Free software and privacy are important, but not absolute requirements.
They might also be storing data on Archive.org or Archive.is for redundancy purposes, but do not trust those services to be around forever and prefer to also store data locally. They also use Zotero to index their physical media collection.
WHAT COULD MAKE A DIFFERENCE
The archivist would love more reliability in the article extraction process. Failing that, an exact copy of the website should be available, preferably as a WARC file, the standard format for web archives. Saving non-HTML URLs (images, PDFs, etc.) would also be important.
Being able to cross-reference tags would be a plus, to search on certain topics. The existing URL-based search is good, but allowing for multiple versions of the same URL would be useful to be able to go back in history.
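To make the two wishes above concrete, here is a minimal sketch of what "several versions of one URL" plus "cross-referenced tags" could look like as a data structure. Everything in it (`Snapshot`, `Archive`, `history`, `with_all_tags`) is a hypothetical name chosen for illustration; it is not how wallabag stores entries.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Snapshot:
    """One capture of a URL at a point in time (hypothetical model)."""
    url: str
    captured_at: datetime
    tags: frozenset = frozenset()


@dataclass
class Archive:
    """Hypothetical in-memory store keeping every capture of a URL."""
    snapshots: dict = field(default_factory=dict)

    def add(self, url, captured_at, tags=()):
        snap = Snapshot(url, captured_at, frozenset(tags))
        versions = self.snapshots.setdefault(url, [])
        versions.append(snap)
        # Keep versions in chronological order so history is easy to browse.
        versions.sort(key=lambda s: s.captured_at)
        return snap

    def history(self, url):
        """All captures of one URL, oldest first."""
        return list(self.snapshots.get(url, []))

    def with_all_tags(self, *tags):
        """Cross-reference: captures carrying every one of the given tags."""
        wanted = set(tags)
        return [s for versions in self.snapshots.values()
                for s in versions if wanted <= s.tags]


archive = Archive()
archive.add("https://example.org/post",
            datetime(2018, 12, 1, tzinfo=timezone.utc), ["privacy", "law"])
archive.add("https://example.org/post",
            datetime(2017, 1, 1, tzinfo=timezone.utc), ["privacy"])
```

The point of keying on the URL and keeping a list per key is that saving a page twice adds a version instead of overwriting or rejecting the second save.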
To be clear, I probably fall into all of the archetypes myself:
So I'm not just an archivist, but I figured it would be useful to elaborate on that use case, following the discussion in https://github.com/wallabag/wallabag/issues/3697#issuecomment-443216052. If relevant, I'd be happy to open specific feature requests for the missing features or point to already existing issues.
We were talking with @romu70 and there was a suggestion to rename the Researcher to Aggregator.
Could you elaborate on why you would use wallabag (designed to show articles in a better way) instead of, for example, Zotero, which is actually made to store references, keep versions, cross-reference and tag documents, take webpage snapshots and so on?
On 2018-12-03 11:31:17, Eloi Coutant wrote:
> We were talking with @romu70 and there was a suggestion to rename the Researcher to Aggregator.
> Could you elaborate on why you would use wallabag (designed to show articles in a better way) instead of, for example, Zotero, which is actually made to store references, keep versions, cross-reference and tag documents, take webpage snapshots and so on?
Zotero has a very limited web interface and a passable desktop client, and it's hard (or impossible) to install a server version. The desktop client, in particular, is based on an unsupported Firefox version and is bound to be destroyed in a cataclysmic event in the future. I'm looking for a way out.
By default, it only keeps a reference of web pages and doesn't save a copy.
It also does not have a mobile app.
I just reworked the archetypes. I added the archivist, slightly modified. I also reworked the Blogger (which replaces the Aggregator and the Researcher). They are available in the archetypes folder.
Sorry for reactivating such an old issue, but I think it gets more relevant every day. My bookmarks are falling apart more and more. I use ArchiveBox with a script that fetches new bookmarks from wallabag, which are then archived in ArchiveBox. But this is so clunky and barely usable. Wallabag will make a copy of the content of the page, but a lot of the time this mechanism fails and I end up with a messed-up wall of text (or no text at all). This is fine, parsing modern websites is hard. But then I am out of options: most of the time, the website is long gone, or ends in a 404 error or on a domain grabber landing page. Accessing the webpage in the state I bookmarked it is essential for me. So to access the real content, I first head over to archive.org, copy/paste the URL, hope there is a backup, then switch to my ArchiveBox installation, search for the URL there, ... This is all cumbersome and should be easier, and part of Wallabag!
Link rot is becoming more and more of a problem. If you bookmark a link today, there is a 20% chance that it no longer works a year from now, and a 50% chance after 5 years. Additionally, a lot of links nowadays can't be fetched or parsed by wallabag any more. So Wallabag is going to lose its use case :(
To prevent this, @anarcat suggested a perfect solution (WARC), which is IMHO the best way out. Additionally, it would be awesome if wallabag could optionally trigger a snapshot on archive.org (plus a link in the UI for faster access).
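As a rough sketch of the archive.org part: as far as I know, the Wayback Machine exposes a "save" URL, a timestamped snapshot URL that redirects to the capture closest to the given time, and an availability endpoint. The helpers below only build those URLs (no network calls), so a UI could link to them or a background job could request them. The function names are made up for this example, and none of this exists in wallabag today.

```python
from datetime import datetime
from urllib.parse import urlencode

# Assumption: these are the public Wayback Machine URL patterns.
WAYBACK_WEB = "https://web.archive.org"
WAYBACK_API = "https://archive.org/wayback/available"


def wayback_save_url(url):
    """URL that asks the Wayback Machine to capture `url` when requested."""
    return f"{WAYBACK_WEB}/save/{url}"


def wayback_snapshot_url(url, when):
    """URL of the capture closest to `when` (e.g. the bookmark date)."""
    return f"{WAYBACK_WEB}/web/{when:%Y%m%d%H%M%S}/{url}"


def wayback_availability_url(url, when):
    """Availability query asking whether a capture near `when` exists."""
    query = urlencode({"url": url, "timestamp": f"{when:%Y%m%d}"})
    return f"{WAYBACK_API}?{query}"


# Hypothetical bookmark: the entry URL and the moment it was saved.
bookmarked_at = datetime(2018, 12, 3, 11, 31, 17)
entry_url = "https://example.org/a"

print(wayback_save_url(entry_url))
print(wayback_snapshot_url(entry_url, bookmarked_at))
print(wayback_availability_url(entry_url, bookmarked_at))
```

Using the bookmark date as the timestamp matters here: it points at the page as it looked when it was saved, not at whatever the domain serves today.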
What are the remaining concerns here? For the implementation part, I can help with that.
Feel free to read this article: https://medium.com/@romu70/personas-and-user-archetypes-a-case-study-with-wallabag-c8405718d715
This is the discussion to talk about the archetypes located here: https://github.com/wallabag/design/tree/master/archetypes.