Closed SaulLu closed 2 years ago
@thomasw21
I'm confused, is this new code?
Yes, it's new code only because I needed to adapt it to our dataset (to our custom column names and to the fact that not all of our examples necessarily have a value in the html_str column). :slightly_smiling_face:
I did copy and paste sections of code from the metadata WG, but they are modified to work in our case. The only snippet of code that is maybe not necessary is the one for ErrorWrapperPreprocessor, but I didn't try to simply import it from bsmetadata.
@HugoLaurencon I don't know if you have a script that actually merges PRs. I don't think we should have merged this (unless you reviewed it?)
This PR adds the Python and SLURM scripts to extract the text, the metadata_HTML, the html_header, the html_footer and the html_title into new columns. cc @thomasw21
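For readers following the thread, here is a rough sketch of the kind of per-example step described above: filling a new html_title column from html_str while tolerating examples that have no value in html_str. This is not the code from the PR or from bsmetadata. The column names html_str and html_title come from the discussion; `TitleParser`, `extract_title` and `add_title_column` are hypothetical names, and the sketch uses only the Python standard library.

```python
from html.parser import HTMLParser
from typing import Optional


class TitleParser(HTMLParser):
    """Collects the text found inside the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is considered.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def extract_title(html_str: Optional[str]) -> Optional[str]:
    """Return the page title, or None when html_str is missing or has no title."""
    if not html_str:
        return None
    parser = TitleParser()
    parser.feed(html_str)
    parser.close()
    title = "".join(parser.parts).strip()
    return title or None


def add_title_column(example: dict) -> dict:
    """Return a copy of the example with an added html_title column (hypothetical helper)."""
    return {**example, "html_title": extract_title(example.get("html_str"))}
```

A function shaped like `add_title_column` takes one example as a dict and returns it with the extra column, so it could be applied row by row; whether the actual scripts are organised this way is an assumption.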