INL / corpus-frontend

BlackLab Frontend, a feature-rich corpus search interface for BlackLab.

Customizing the "Document" page #455

Closed by stcoats 1 year ago

stcoats commented 1 year ago

Hello again!

I would like to change the Document page, which currently shows the local filename as the title and does not have labels in the table:

[image: screenshot of the Document page]

Do I need to change meta.xsl for this? Or do something in the custom.article.js file?

I am using the default format.yml file for indexing, with this option:

  - forEachPath: meta
    namePath: "@name"
    valuePath: .

How do I exclude specific metadata fields? For example, I don't need to display the local filepath.

KCMertens commented 1 year ago

The filepath is the default title of a document because it's one of the few fields that's guaranteed to exist. To change it, override the title field in the blf.yaml (your format file) and reindex:

corpusConfig:
  specialFields:
    titleField: some_metadata_field_name_here

By default, all fields are shown in the metadata display, but once you define groups, only fields inside those groups are shown. Note that this applies everywhere, so it also affects the options for sorting, grouping results, filters, etc. If you need more granular control, you'll have to use the configuration functions in JavaScript. Here's an example from one of our own corpora:

corpusConfig:
  # How to group fields into tabs
  metadataFieldGroups:
  - name: Common
    fields: 
    - title
    - titleLevel2
    # - ...

  - name: Newspapers
    fields: 
    - settingOrganization
    - settingPerson
    # - ...

  - name: Easy Language
    fields: 
    - subtypeBasilex
    - targetAudience
    # - ...
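Applied to the corpus in this thread, the two overrides above might look like the sketch below. The field names are taken from the BlackLab response posted further down; the group names (Video, Channel, Location) are hypothetical, and picking video_title as the title is an assumption about what reads best. fromInputFile (and docId) are left out of every group, so they should no longer show up in the metadata display, nor in filters, sorting, or grouping.

```yaml
corpusConfig:
  specialFields:
    # Assumption: video_title is the most readable title for this corpus
    titleField: video_title

  # Once groups exist, only fields listed in a group are shown.
  # fromInputFile and docId are deliberately omitted.
  metadataFieldGroups:
  - name: Video        # hypothetical group name
    fields:
    - video_title
    - upload_date
    - video_length
    - nr_words

  - name: Channel      # hypothetical group name
    fields:
    - channel_title
    - channel_url
    - council_name

  - name: Location     # hypothetical group name
    fields:
    - location
    - state
    - country
    - latlong
```

Since both settings live in the format file, expect to reindex before they take effect, as with the title field above.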

As for the labels not appearing, that's more mysterious. It would help if you could post a snippet of the XML returned by BlackLab; the URL will be /blacklab-server/${corpus}/docs/${some_doc}

The quickest fix is probably to just override how metadata is displayed for your corpus.

stcoats commented 1 year ago

Thanks, I will try those approaches. An example doc looks like this:

<blacklabResponse>
<docPid>49</docPid>
<docInfo>
<channel_title>
<value>South Burnett Regional Council</value>
<value>South Burnett Regional Council</value>
</channel_title>
<country>
<value>AUS</value>
</country>
<docId>
<value>oTMLR6i3vyY</value>
</docId>
<latlong>
<value>(-26.539756, 151.843129)</value>
</latlong>
<video_length>
<value>9501.64</value>
</video_length>
<nr_words>
<value>26589.0</value>
</nr_words>
<channel_url>
<value>https://www.youtube.com/channel/UCo2az2nfFmfqYuXyfP_LFDw</value>
</channel_url>
<fromInputFile>
<value>/home/centos/storage/DONOTREMOVE/blacklab/blacklab-core-3.0.1/../tmp/test_QLD_34295.xml</value>
</fromInputFile>
<video_title>
<value>General Council Meeting 20-10-2021</value>
</video_title>
<council_name>
<value>South Burnett Regional Council</value>
</council_name>
<location>
<value>Kingaroy QLD 4610, Australia</value>
</location>
<state>
<value>QLD</value>
</state>
<upload_date>
<value>20211020</value>
</upload_date>
<lengthInTokens>26588</lengthInTokens>
<mayView>false</mayView>
</docInfo>
<metadataFieldGroups/>
<docFields>
<titleField>fromInputFile</titleField>
</docFields>
<metadataFieldDisplayNames>
<channel_title>Channel title</channel_title>
<channel_url>Channel url</channel_url>
<council_name>Council name</council_name>
<country>Country</country>
<docId>Doc id</docId>
<fromInputFile>From input file</fromInputFile>
<latlong>Latlong</latlong>
<location>Location</location>
<nr_words>Nr words</nr_words>
<state>State</state>
<upload_date>Upload date</upload_date>
<video_length>Video length</video_length>
<video_title>Video title</video_title>
</metadataFieldDisplayNames>
</blacklabResponse>

stcoats commented 1 year ago

It seems that my XML input files were non-standard, so the default meta.xsl file did not process them correctly. I rewrote meta.xsl, and now everything displays correctly.