security-force-monitor / sfm-cms

Platform for sharing complex information about security forces. Powers WhoWasInCommand.com
https://whowasincommand.com

General page titles, improve page metadata, and miscellaneous SEO improvements #357

Open tlongers opened 6 years ago

tlongers commented 6 years ago

Currently the title of all pages is: <title>WhoWasInCommand</title>

An open question is whether it would be beneficial to include the record name in the page title, so that it displays in tabs and bookmarks. It may also help with external search indexing.

jeancochrane commented 6 years ago

Yeah, I think this is a good idea. Would help if you have a bunch of different detail views in your browser history, too.
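To make the idea concrete, here is a minimal sketch of how a per-record title could be assembled. The function names, the " | " separator, and the optional entity-type segment are all assumptions for illustration, not how the site actually builds its titles.

```python
import html

SITE_NAME = "WhoWasInCommand"


def build_page_title(record_name=None, entity_type=None, site_name=SITE_NAME):
    """Join the record name, an optional entity type, and the site name.

    Falls back to the bare site name when no record name is available,
    which matches the current behaviour described in this issue.
    """
    parts = []
    if record_name and record_name.strip():
        # Collapse stray whitespace so tabs and bookmarks stay tidy.
        parts.append(" ".join(record_name.split()))
        if entity_type:
            parts.append(entity_type)
    parts.append(site_name)
    return " | ".join(parts)


def title_tag(record_name=None, entity_type=None):
    """Wrap the title in a <title> element, escaping HTML-special characters."""
    return f"<title>{html.escape(build_page_title(record_name, entity_type))}</title>"
```

Putting the record name first keeps it visible when a browser truncates the tab label.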

tlongers commented 6 years ago

Possibly integrate enrichment features like Twitter cards too.

tlongers commented 6 years ago

Oooh I love this issue :)

tlongers commented 3 years ago

We have had a few bouts wrestling with aggressive crawlers (see #722, #489 for detail). Pinning a few additional thoughts to this issue:

hancush commented 3 years ago

Bumping this thread, per our conversation this week, @tlongers.

hancush commented 3 years ago

Mission log: Disabled crawling of the search page and added a site map including all translations of static pages and the English version of entity pages (because the relevant search terms – name, locations, units, etc. – do not vary across languages, and this makes more economical use of our crawl budget). Adding dynamically populated page title and description meta tags next.

Another thing that I noticed is that source links are not included in the page source. Outbound links can be beneficial to SEO, so it may be worth investigating what it would take to load them in so they can be crawled. I suspect but am not certain it's an efficiency thing. If that's the case, caching could help. Related to #752.
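For reference, the crawl setup in the mission log can be sketched roughly as below. This is only an illustration under stated assumptions: the language codes, static page slugs, and search path are hypothetical, and robots.txt is just one common way to disable crawling of a page; the mission log does not say which mechanism was used.

```python
import xml.etree.ElementTree as ET

BASE_URL = "https://whowasincommand.com"
LANGUAGES = ["en", "es", "fr"]    # assumption: the set of site translations
STATIC_PAGES = ["about", "help"]  # hypothetical static page slugs
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_urls(entity_paths):
    """Static pages in every language, entity pages in English only."""
    urls = [
        f"{BASE_URL}/{lang}/{slug}/"
        for slug in STATIC_PAGES
        for lang in LANGUAGES
    ]
    # Entity search terms (names, locations, units) do not vary by language,
    # so listing only the English version saves crawl budget.
    urls += [f"{BASE_URL}/en/{path.strip('/')}/" for path in entity_paths]
    return urls


def render_sitemap(urls):
    """Serialize the URLs as a sitemap <urlset> document."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def render_robots(disallowed_prefixes):
    """Build a robots.txt that blocks the given paths and advertises the sitemap."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}" for prefix in disallowed_prefixes]
    lines.append(f"Sitemap: {BASE_URL}/sitemap.xml")
    return "\n".join(lines) + "\n"
```

For example, `render_robots(["/en/search/"])` would block a hypothetical English search path while still pointing crawlers at the sitemap.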

tlongers commented 3 years ago

Thank you for the update.

As you say, sources used for any specific data point are (I think) only loaded when the user activates the citation widget for that data point. So the sources are not contained anywhere that a crawler could access them. The two immediate propositions seem to be:

One thing that we have considered doing is introducing an "Evidence/sources used to make this record" section on the page. In addition to showing things at the level of an individual datapoint, as we do now, the user could also just see a list of all the sources pertaining to everything on that page, listed in a single table. This may be a more intuitive way to help the user see where we get data from overall, and provide a cleaner method of showing a page's outlinks.
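A rough sketch of the aggregation such a table would need: walk every datapoint on the page, de-duplicate the sources, and remember which fields each one supports. The dictionary shapes and key names (`field`, `sources`, `id`, `title`, `url`) are assumptions for illustration, not the actual data model.

```python
def collect_page_sources(datapoints):
    """Return one row per distinct source cited anywhere on the page.

    Each datapoint is assumed to look like:
        {"field": "rank", "sources": [{"id": ..., "title": ..., "url": ...}]}
    """
    rows = {}
    for datapoint in datapoints:
        for source in datapoint["sources"]:
            # First sighting creates the row; later sightings reuse it.
            row = rows.setdefault(source["id"], {**source, "used_by": []})
            if datapoint["field"] not in row["used_by"]:
                row["used_by"].append(datapoint["field"])
    # Alphabetical order gives a stable, scannable table.
    return sorted(rows.values(), key=lambda row: row["title"].lower())
```

Because every row carries the source URL, rendering the result as a plain HTML table would also put the outbound links into the page source where crawlers can see them.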

hancush commented 3 years ago

Oh, that's a really good idea, @tlongers. I threw together a sources section on the Person page to assess viability. Seems doable to replicate on the other entity pages, as well! You can check it out on the staging site, e.g., https://back.securityforcemonitor.org/en/person/view/dab3a8eb-ac42-45f2-8810-14609c0ecbcc/#sources. Let me know what you think?

tlongers commented 3 years ago

Thanks @hancush This is good - I like it:

I'll outline some general thoughts on this below.

What is the appropriate level of detail to include in this table?

The columns included identify the sources, but not the access points that were actually used. The "access point" mechanism is what we use to specify the exact part of a source that we are using to evidence a datapoint. The most intuitive examples of access points are the page of a document source, or the start and end time in a video clip. Less intuitive are archive access points, by which we mean "this source as archived at this specific point". In some cases, where the source itself is no longer online at its original URL, the archive is the only way we can access it. To include data of this nature, we'd need to design some logic around the interplay between source:access_point_type and source:access_point_trigger.
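As a starting point for that logic, here is a small sketch of how an access point might be turned into a human-readable label. The exact semantics of source:access_point_type and source:access_point_trigger are not spelled out in this thread, so the type values ("page", "timestamp", "archive") and the meaning given to the trigger are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPoint:
    # Assumed values: "page", "timestamp", or "archive".
    type: str
    # Assumed meaning: a page number, a "start-end" time range,
    # or the date of an archived snapshot, depending on the type.
    trigger: str


def describe_access_point(access_point):
    """Return a short label saying where in the source the evidence sits."""
    if access_point.type == "page":
        return f"p. {access_point.trigger}"
    if access_point.type == "timestamp":
        return f"at {access_point.trigger}"
    if access_point.type == "archive":
        return f"as archived on {access_point.trigger}"
    # Unknown types fall back to the raw trigger rather than hiding it.
    return access_point.trigger
```

The same helper could feed both the sources table and the citation popovers, which would keep the two displays consistent.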

It's also worth pointing out that the citation popups don't include access points either, so if we did increase the richness of the data in the proposed "sources" table it'd be worth revisiting the content of the citation popovers as well in order to provide some consistency. For some history on the citation widget, check out the design research doc on the wiki, and the discussion between JC and me in #273 and also #443, which is a thread on the design of the "source picker" in the (now defunct) data entry system.

I wonder what the overall effect of having two ways of accessing sources is for users though. Does it clarify or confuse? In working on this, we're revisiting an old Internet problem of handling citations in situations where a hyperlink is just not precise enough, and academic-style endnote/footnote citation styles are just inappropriate. Our solution - the citation popover - is as good as anything else out there, but I still think it's a rarefied feature and not widely used by site visitors (we don't track this in Matomo, though perhaps we should). We need to think about ways of signposting these features a little more.

What would it look like for larger records?

Can you hack together a version of it for a massive record like this one, which has over 500 datapoints displayed? I suspect it will show 50-100+ rows in the sources table. Would the source table get unwieldy enough that we might want to include some kinds of sorting and filtering on them? If we do it for these tables, should we perhaps do it for all the others as well?

Harmonising this with making sources available in WWIC more generally

We have an open issue (#752) proposing to make full source metadata available to all users, so they can search for sources and see how we have used them. If we want to do this, what is the interplay between the sources table in an entity record and the record for a source (which includes a reverse view of the records/fields in which it is used)? I don't have an answer to this.

@tonysecurityforcemonitor What's your take on this?

hancush commented 3 years ago

This is really wonderful context, @tlongers, and I appreciate this opportunity to pause and reflect on / define the purpose of each way we want to expose sources, because they really enrich this dataset and represent a massive effort on your end that deserves visibility!

Here's what I'm thinking for the source elements, as of now. Interested in yours and @tonysecurityforcemonitor's thoughts/reactions! These comments are getting to be long, so I'd be happy to hop on a call next week if it would help us arrive at an immediate path forward more expediently.

Inline citations

I pushed a quick and dirty idea of what these design changes (sans adding the access point to the popover) could look like. Check it out on the staging site, e.g., https://back.securityforcemonitor.org/en/person/view/013dd83b-b3e6-4260-a0cc-4ed48e771e0f/

Sources list

Source detail page (#752)

And a bonus thought

I think that making the sources more prominent through the listing, and later by making them first class entities, will prominently advertise that sources exist and clarify what the sources actually are. With this context, users may engage more with the inline citations because they know what kind of information they contain. The sources also support the facts of the data, and there may be some indirect benefits through increased visibility, e.g., strengthened perception of trust boosts engagement across the site.

tonysecurityforcemonitor commented 3 years ago

Inline citations

Who are they for? User interested in evidence for a particular data point

What should they include? Source summaries and access points

Design notes

Add access point to tooltips. If someone calls up the source for a particular fact, we should give them the information they need to corroborate it in the source.

Consider toggles by default. Hover navigation is hidden by design. It's also not great for accessibility or mobile users. I'd recommend moving to exposing citations by default.

Label citation toggles with something other than numbers. Numbering a citation, to me, suggests that it corresponds to a reference elsewhere. Fact to sources is a one-to-many relationship, so this is a bit confusing. I think the addition of a source list will exacerbate this tension. The design wiki offers an example where citations are indicated by a plus sign. This could be a good solution for us, as well!

Remove confidence background color from citation toggles. There isn't a key anywhere indicating what it means, and it reduces visual clutter when showing citation toggles by default.

I pushed a quick and dirty idea of what these design changes (sans adding the access point to the popover) could look like. Check it out on the staging site, e.g., https://back.securityforcemonitor.org/en/person/view/013dd83b-b3e6-4260-a0cc-4ed48e771e0f/

All of these ideas resonate with me, and I think you've laid it out really nicely. Nothing to add; it all sounds good!

Sources list

Who are they for? User interested in information about an entity in general (Also good for building external link network)

What should they include? Source summaries, with through link to source detail page

Design notes

I don't think it makes sense to include access point here, because sources will be separated from the specific facts they evidence.

@tlongers, re: sorting and filtering, thinking of the purpose the source list is serving, what kind of sorting / filtering would help users find what they need? Perhaps sorting by date, filtering by publication source (e.g., show me the newest sources from the United Nations)? Will this functionality become redundant with source search, as implemented in #752? Can we organize static data in a way that Ctrl+F could support discovery?

Here's a person record with a big ol' list of sources: http://back.securityforcemonitor.org/en/person/view/af9278e9-f99d-4051-9bad-96dab371ce4d/

Seeing the mockups of the source list has been quite helpful - I suspect this will be the place that most users will look at to understand what we've used to build our data. I think because it resembles how/where sources are placed in other sites, particularly Wikipedia, it will be a natural place for users to look. I also think it ties nicely to most users' questions, which are "what are your sources for a particular unit, person, etc."

The access point issue is tricky... I can understand that we're trying to balance showing users a more immediate and digestible view of the sourcing because the in-depth look is provided by inline citations. And in many cases the mockup view makes sense. The biggest potential pain point in my mind would be Incidents, where the access point gives the user a useful guide to where in a lengthy human rights report we found a particular incident. Somewhat related would be when we make use of books or other paginated sources, mainly in unit records. I'm thinking here of Middle East Air Power in the 21st Century and Building the Tatmadaw: Myanmar Armed Forces Since 1948. Less common would be cases like https://back.securityforcemonitor.org/en/organization/view/fffddda2-24a5-463b-b305-780b2e9a3a90/ where, in the sourcing for the area of operations, the actual content of a page changed through time even if the URL did not, so the various archive access points are actually important to understanding the area of operations of the unit through time. But this is all mitigated by the fact that we do have all of the information in the inline citations... However, it is then complicated by the fact that the inline citations would differ from the Sources List, so in some ways the Sources List isn't truly a sources list... Perhaps a simple solution would be renaming it "Sources Overview" or something else to flag that this is a summation of sourcing which is more detailed in the inline citations. Or... am I overthinking this?

Re: filtering... I'm not sure, perhaps it makes sense to build in that functionality from the start, though I'm not sure if users would use it, but perhaps having more tools rather than less is good.

Source detail page (#752)

Who are they for? User interested in structured information from a particular source (Also good for building internal link network)

What should they include? Source summary and access point detail, with through links to all facts supported by the relevant source, coupled with the corresponding access point

This sounds good :)

hancush commented 3 years ago

All great points, @tonysecurityforcemonitor. Will respond more to the weeds of the question you raise about access points in the source list, but I wanted to flag that source lists have been added to all three types of entity page, and I've added basic sorting and filtering with a lovely lightweight library called DataTables, which we've used quite a bit in other projects.

Also noting for myself that some of the larger entity pages were sluggish before and are a bit more sluggish with sources. I've added 24-hour caching to entity pages, and I'd like to think on expanding that timeout, since the data isn't changing regularly. (We'd, of course, bust the cache when we upload new data.)
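The caching behaviour described here (serve a stored page for up to 24 hours, and throw everything away when new data is uploaded) can be sketched in a few lines. This is a self-contained illustration only; the class and method names are hypothetical and it is not the caching layer the site actually uses.

```python
import time

ONE_DAY_SECONDS = 24 * 60 * 60


class PageCache:
    """Tiny in-memory cache with a time-to-live and a manual bust."""

    def __init__(self, ttl_seconds=ONE_DAY_SECONDS, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be tested
        self._store = {}      # key -> (expires_at, rendered_page)

    def get_or_render(self, key, render):
        """Return the cached page, rendering and storing it if missing or stale."""
        now = self._clock()
        cached = self._store.get(key)
        if cached is not None and cached[0] > now:
            return cached[1]
        page = render()
        self._store[key] = (now + self._ttl, page)
        return page

    def bust(self):
        """Drop every cached page, e.g. right after a data upload."""
        self._store.clear()
```

With an explicit bust on upload, the timeout mostly acts as a safety net, which is why lengthening it is low risk when the data changes rarely.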

This is all deployed to the staging site!

tlongers commented 3 years ago

@hancush the update to the sources table is great to see; thank you. The UX is simple and snappy, and the search/filtering function from DataTables is a nice way to invite exploration of the data. Outside of the current context, we should explore whether this should be wired into every table, including the main entity searches. The impact of the caching is positive too - as you say, big pages that just got bigger load in a snap now.

Seeing this work definitely helps us move this area of WWIC forward a bit. I've been going around in circles a bit about the direction to take with how we weave sources into the content. I started by looking at the citation system that exists on WWIC pages. I've also gone back over our experiences with how we dealt with source creation and selection during data creation (for example, on the unit creation page, for the logged in user) and also the view of data shown when we cued up a specific access point (like this one for access point 907e19-680b-4907-a063-5ce1748ab396 (login needed)).

I think that we're providing functionality that serves two extremes:

So what do we do? The new "sources" block is a first attempt at implementing something in the middle ground here, but it may still be too vague to meet a solid need. Yes, it shows the evidence base; no, it's still not that useful in fact-checking, or other needs users may have to bring data into their own work. The questions this leads me to are "what are people trying to find out about?" and "what do they want to know about it?" I reckon it's more about groupings of data that, read together, form a comprehensible statement. For example, here's the row for areas of operation for Operación Conjunta Nuevo León - Tamaulipas:

image

The need here is "show me why you say the unit was based there between those dates". The answer is provided by showing all the sources used for that specific row. Rather than putting it in a citation widget, this makes it worth generating a page specific to those sources for that grouping of datapoints. Similarly, from the same record, a row about unit memberships:

image

The answer to the question of "what do they need to know about it?" is .. everything: full source and access point metadata, with all relevant links. My hunch is that citations for a chunk of data are more meaningful to users than citations for either individual fields or the complete dataset. The good thing is that we've actually done the hard work here of presenting meaningful groupings of data: we just need to leverage it better.

This is a bit of a departure, and I'm sure it is more technically involved. My thinking on how it might play out would be to take this route.

I'll stop here and pass the mic on...

hancush commented 3 years ago

@tlongers @tonysecurityforcemonitor We've all written quite a bit, and I really appreciate the conversation, especially as I continue to get up to speed on project history. The concept of sourcing for a "data chunk" is really compelling, and I'd love to develop that more with you! Are you two available to talk synchronously about higher level questions of sourcing?

It stands out to me that we all offered guesses as to whether and how people are using existing functionality. This is knowable: We can capture events corresponding to citation and source list interactions in Matomo. I think this would provide really helpful signal as we think on and ultimately make decisions about how to expose sources.
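To give a sense of what capturing such an event involves, here is a minimal sketch that builds a request URL for Matomo's Tracking HTTP API, where `e_c`, `e_a`, and `e_n` carry the event category, action, and name. The Matomo host, site ID, and the category/action labels are placeholders; on the site itself this would more likely be done from the browser with Matomo's JavaScript tracker.

```python
from urllib.parse import urlencode

MATOMO_ENDPOINT = "https://matomo.example.org/matomo.php"  # hypothetical host
SITE_ID = 1                                                # hypothetical site ID


def event_tracking_url(category, action, name, page_url):
    """Build (but do not send) a Matomo event-tracking request URL."""
    params = {
        "idsite": SITE_ID,
        "rec": 1,          # required by the Tracking API to record the request
        "url": page_url,   # page on which the interaction happened
        "e_c": category,   # event category, e.g. "Sources"
        "e_a": action,     # event action, e.g. "Open citation"
        "e_n": name,       # event name, e.g. the field that was cited
    }
    return f"{MATOMO_ENDPOINT}?{urlencode(params)}"
```

Using one category for inline citations and another for the source list would let the two be compared directly in Matomo's event reports.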

Meanwhile, I took a look at Matomo this morning and saw that search engines account for nearly 90% of referrals. The pending changes from this issue should positively affect search coverage and the appearance of search results. Given this is such a major source of traffic, I propose that we separate our broader thinking on sources from these pending changes, so that we can launch them in the near term and start to reap the benefits. Are you all amenable to that?

If so, the source list seems to be the only potentially unfinished change. What are you two thinking on the current version? Is it serviceable enough to deploy, perhaps with some Matomo event capturing to help inform our broader discussion of exposing sources? If not, can we get it there by tweaking it? If the answer is still no, I can pull it out into a separate development branch to unblock the rest of these improvements – just let me know where your heads are at, on this and/or more broadly!

tlongers commented 3 years ago

+1 to chat.

hancush commented 3 years ago

I've deployed the changes in #757! Some further updates to the mission log: Told Google about the new site maps and they have registered.

Screen Shot 2021-06-03 at 1 52 26 PM

Also confirmed citation and source list interactions are showing up in Matomo. Here's me testing:

Screen Shot 2021-06-03 at 1 54 45 PM

tlongers commented 3 years ago

Looking good!

I caught this issue with the sources table: #760

tlongers commented 3 years ago

More work needed to close this. Keeping open.