Problem
We need two things:
Add a Metadata object to hold persistent user configuration, such as which model to use when performing a classification and where to find it.
Add two fields to the ScrapedData schema:
label : str = Classification label, needed to check whether a row has already been labeled and which label it was assigned.
source : str = Where this data came from, useful as a hint of how trustworthy a label is. For example, a label assigned by a human is far more trustworthy than one produced by a trained classifier.
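As a rough illustration of the two new fields, here is a minimal sketch. `ScrapedDataSketch` is a simplified stand-in, not the real ScrapedData class: the `url` field, the `None` defaults, the `is_labeled` helper, and the example values "relevant" and "human" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScrapedDataSketch:
    """Simplified stand-in for ScrapedData (assumption, not the real class)."""

    # Assumed pre-existing field, shown only to give the row an identity
    url: str
    # New field: classification label; None means the row is not labeled yet
    label: Optional[str] = None
    # New field: who or what assigned the label
    source: Optional[str] = None


def is_labeled(row: ScrapedDataSketch) -> bool:
    """Hypothetical helper: a row counts as labeled once `label` is set."""
    return row.label is not None


row = ScrapedDataSketch(url="https://example.com/article")
print(is_labeled(row))  # False, nothing assigned yet

row.label = "relevant"  # example label value (assumption)
row.source = "human"    # example source value (assumption)
print(is_labeled(row))  # True
```

Defaulting both fields to `None` is one way to let previously stored rows load without a label; the actual implementation may handle this differently.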
Solution
Add these fields to the ScrapedData class, which requires updating the PersistencyManager implementation and some functions in the CLI tool.
Add the Metadata class, which is JSON-serializable so it can easily be saved to and parsed from a local file. Also, update the manager to accept it in its constructor.
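The JSON round trip described above could look roughly like the following. This is a sketch under stated assumptions, not the Metadata class from this PR: the class name `MetadataSketch`, the field names `model_name` and `model_path`, and the method names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class MetadataSketch:
    """Hypothetical stand-in for the Metadata class."""

    # Assumed field names: which model to use and where to find it
    model_name: Optional[str] = None
    model_path: Optional[str] = None

    def to_json(self) -> str:
        # asdict() turns the dataclass into a plain dict that json can dump
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text: str) -> "MetadataSketch":
        # Keys in the JSON object map directly onto constructor arguments
        return cls(**json.loads(text))

    def save(self, path: Path) -> None:
        path.write_text(self.to_json(), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> "MetadataSketch":
        return cls.from_json(path.read_text(encoding="utf-8"))
```

A manager could then receive the loaded object through its constructor, for example `Manager(persistency_manager, MetadataSketch.load(path))`, where the argument order is again only an assumption.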
Additional changes
Added commands for browsing locally stored experiments: c4v experiment ls, c4v experiment summary
Added bulk classification to the microscope.Manager and Classifier objects
Added a command to classify and label rows that have no label yet: c4v classify <experiment_name> pending
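The idea behind classifying pending rows in bulk can be sketched as follows. This is not the code in microscope.Manager or Classifier: `Row`, `classify_pending`, the `classify_batch` callable, and the batch size are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Row:
    """Hypothetical minimal row: text content plus the two new fields."""

    content: str
    label: Optional[str] = None
    source: Optional[str] = None


def classify_pending(
    rows: Iterable[Row],
    classify_batch: Callable[[List[str]], List[str]],
    source_name: str,
    batch_size: int = 32,
) -> int:
    """Label every row that has no label yet and return how many were labeled.

    `classify_batch` stands in for a trained classifier: it takes a list of
    texts and returns one label per text, in the same order.
    """
    # Rows that already carry a label (for example from a human) are skipped
    pending = [row for row in rows if row.label is None]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        labels = classify_batch([row.content for row in batch])
        for row, label in zip(batch, labels):
            row.label = label
            row.source = source_name  # record where the label came from
    return len(pending)


# Usage with a dummy classifier that labels by text length (illustration only)
rows = [Row("short"), Row("a much longer text"), Row("kept", "relevant", "human")]
count = classify_pending(
    rows,
    lambda texts: ["long" if len(t) > 10 else "short" for t in texts],
    source_name="dummy-classifier",
    batch_size=2,
)
print(count)  # 2, the human-labeled row is left untouched
```

Recording the source next to each label keeps classifier output distinguishable from human labels, which is the reason the source field was introduced.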
Relevant files
src/c4v/scraper/scraped_data_classes/scraped_data.py = File with the ScrapedData schema
src/c4v/scraper/persistency_manager/sqlite_persistency_manager.py = The PersistencyManager implementation that had to be updated with the schema changes
src/c4v/microscope/metadata.py = Metadata class
src/c4v/microscope/manager.py = Manager object, updated to add the Metadata and the bulk classification function
src/c4v/c4v_cli.py = New commands added