Open rlucas7 opened 3 days ago
Adding context for:
The BertViz library dumps all the attentions into the data attribute of the html class of the output. This is so that it can work on the rendering with the d3 javascript library (I think). It would be nice to have the embeddings read in from sqlite-vec: given a chosen layer, it would read in the vectors for only that layer and render the visual.
What happens in the latest version of BertViz (1.4.0) is that the attentions are written as a substitution for a constant in the *.js versions of the views. Here is the line for the constant in the head_view, and the line which does the interpolation is here in the head_view.py module, for the html_action=='view' branch, which is the one used within a jupyter notebook. To embed into an html page outside a jupyter notebook you use the 'return' branch of the function instead and then format appropriately for your templating engine. The jinja module is used inside the flask app, so I pass the data from this view through Markup() so that the html tags are rendered as markup rather than escaped.
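To make the substitution step concrete, here is a minimal sketch of the general technique using only the standard library. It is not BertViz's actual code: the template string, the placeholder name PARAMS_PLACEHOLDER, and the function render_view_html are all hypothetical names chosen for illustration.

```python
import json

# Hypothetical stand-in for a view's JavaScript file. The placeholder name
# PARAMS_PLACEHOLDER is an assumption, not necessarily the constant BertViz uses.
VIEW_JS_TEMPLATE = """
const params = PARAMS_PLACEHOLDER;
console.log(params.tokens.length);
"""


def render_view_html(attention, tokens):
    """Return an HTML fragment with the attention data inlined in a <script> tag."""
    params = {"attention": attention, "tokens": tokens}
    # Escape "</" so the serialized data cannot close the <script> element early.
    payload = json.dumps(params).replace("</", "<\\/")
    # Plain string substitution: the constant in the JS source becomes a JSON literal.
    js = VIEW_JS_TEMPLATE.replace("PARAMS_PLACEHOLDER", payload)
    return f'<div id="vis"></div>\n<script>{js}</script>'


# One layer, one head, two tokens: small enough to read in the page source.
html = render_view_html([[[0.9, 0.1], [0.4, 0.6]]], ["def", "foo"])
```

Because every layer and head is serialized into that one literal, the page size grows with the full attention tensor, which is the limitation noted in this issue.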
Also, here is a screenshot from one of the inspect views inside the app to demonstrate what I'm referring to by writing all the attentions into the html (inside a <script> tag element).
There are a couple of different libraries I investigated. I am choosing
query
from search and a SERP.
Others I considered but will hold off on for now are:
SERP = Search Engine Results Page
PL = Programming Language
Note here the 'page' is a search/IR vernacular for some block of text, which historically referred to a page from a document (say a book or website), whereas here the 'page' typically refers to a class/method/function from the PL.
There are pros and cons to using each one, so I'll just pick one to implement and move forward, perhaps investigating the others afterwards to see whether they address anything that turns out to be a clear limitation. Plus I'll be able to contextualize those impacts better once some use has been made of the chosen library.
For now I have some scratch code for BertViz I'm working to include as an inspection tool for a SERP.
I will know this issue is resolved when I can click on a SERP and see a BertViz of the query and the result.
Doing so enables someone to dynamically understand which tokens in the phrasing of their search query generate the selected response. This view lets the user dynamically and iteratively reformulate their query without needing to understand the complicated vernaculars of e.g. lucene style queries.
Limitations:
The BertViz library dumps all the attentions into the data attribute of the html class of the output. This is so that it can work on the rendering with the d3 javascript library (I think). It would be nice to have the embeddings read in from sqlite-vec: given a chosen layer, it would read in the vectors for only that layer and render the visual.
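A rough sketch of what the per-layer read could look like is below. It makes several assumptions: an ordinary sqlite3 table stands in for sqlite-vec storage, the table and column names (token_vectors, layer, position, token, vector) are hypothetical, and the vectors are assumed to be stored as packed little-endian float32 BLOBs.

```python
import sqlite3
import struct


def pack_f32(vec):
    """Pack a list of floats into a little-endian float32 BLOB (assumed storage format)."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_f32(blob):
    """Inverse of pack_f32: 4 bytes per float32 value."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


# Hypothetical schema: one row per (layer, token position).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE token_vectors "
    "(layer INTEGER, position INTEGER, token TEXT, vector BLOB)"
)
conn.executemany(
    "INSERT INTO token_vectors VALUES (?, ?, ?, ?)",
    [
        (0, 0, "def", pack_f32([0.5, 0.25])),
        (0, 1, "foo", pack_f32([1.0, 0.0])),
        (1, 0, "def", pack_f32([0.75, 0.125])),
        (1, 1, "foo", pack_f32([0.0, 2.0])),
    ],
)


def load_layer(conn, layer):
    """Read only the vectors for the chosen layer, in token order."""
    cur = conn.execute(
        "SELECT token, vector FROM token_vectors WHERE layer = ? ORDER BY position",
        (layer,),
    )
    return [(token, unpack_f32(blob)) for token, blob in cur]


layer_1 = load_layer(conn, 1)
```

With something like this, the view would only need to serialize the result of load_layer for the selected layer into the page, instead of every layer at once.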