isawnyu / isaw.web

ISAW website buildout
http://isaw.nyu.edu

Dump a list of all URIs in links in the database #535

Open paregorios opened 1 month ago

paregorios commented 1 month ago

To support content review and improvement work, we need to be able to dump a list of all URIs that appear in content in the database. This includes URIs in fields designated for the purpose (e.g., in "Link" content items) as well as URIs found in href attribute values on <a> elements in HTML in any field. The ideal output format would be CSV, with columns for the URI, the context path, the field name, and the text of the <a> element (if any). A command-line script run on the server would be a fine solution, since only admins will need to perform this function.
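
To make the requested output concrete, here is a minimal sketch of the extraction and CSV steps using only the Python standard library. It is not tied to the site's actual storage layer: the `records` iterable of `(context_path, field_name, value, is_html)` tuples, the function names `extract_links` and `dump_uris`, and the CSV header names are all assumptions made for illustration. How those records would be pulled from the database is left open.

```python
import csv
import io
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, link text) pairs from <a> elements in an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text chunks seen inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip named anchors such as <a name="x">
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) pairs found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def dump_uris(records, out):
    """Write one CSV row per URI.

    `records` is a hypothetical iterable of
    (context_path, field_name, value, is_html) tuples.
    """
    writer = csv.writer(out)
    writer.writerow(["uri", "context_path", "field_name", "link_text"])
    for path, field, value, is_html in records:
        if is_html:
            for href, text in extract_links(value):
                writer.writerow([href, path, field, text])
        elif value:
            # Dedicated URI field: the value itself is the URI.
            writer.writerow([value, path, field, ""])


if __name__ == "__main__":
    # Made-up sample data standing in for real content items.
    sample = [
        ("/news/item", "text",
         '<p>See <a href="http://example.org/a">the site</a>.</p>', True),
        ("/links/l1", "remote_url", "http://example.org/b", False),
    ]
    buf = io.StringIO()
    dump_uris(sample, buf)
    print(buf.getvalue(), end="")
```

In a real command-line script, `out` would more likely be a file opened with `newline=""` (as the `csv` module documentation recommends) or `sys.stdout`, and `records` would be generated by walking the content items on the server.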