BU-Spark/ml-herbarium
Herbaria ML
ML-Herbarium: New Pipeline #82
Closed
eamonniknafs closed 1 year ago
eamonniknafs commented 1 year ago
Adds a new pipeline that implements the following features:
Tesseract OCR
Corpus generation for real-world use
Structural pattern matching to map fuzzy-matched OCR outputs to the corpus
Synonym recognition
Automated documentation
Multithreading
Preprocessing
Improved segmentation
Improved scraping
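
To illustrate how fuzzy-matched OCR output, synonym recognition, and multithreading from the list above might fit together, here is a minimal sketch. It is not the pipeline's actual code: it assumes the OCR text has already been extracted, it uses the standard library's `difflib` for fuzzy matching rather than whatever the PR uses, and `CORPUS`, `SYNONYMS`, `match_to_corpus`, and `match_all` are hypothetical names with made-up sample data.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import get_close_matches

# Hypothetical corpus of accepted taxon names (illustrative data only).
CORPUS = ["Quercus alba", "Quercus rubra", "Acer saccharum"]

# Hypothetical synonym table mapping a synonym to its accepted name.
SYNONYMS = {"Acer saccharophorum": "Acer saccharum"}


def match_to_corpus(ocr_text, cutoff=0.8):
    """Return the accepted name closest to a noisy OCR string, or None."""
    # Search accepted names and known synonyms together.
    candidates = CORPUS + list(SYNONYMS)
    hits = get_close_matches(ocr_text, candidates, n=1, cutoff=cutoff)
    if not hits:
        return None
    best = hits[0]
    # Resolve a matched synonym to its accepted name.
    return SYNONYMS.get(best, best)


def match_all(ocr_texts, workers=4):
    """Match many OCR strings concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(match_to_corpus, ocr_texts))


if __name__ == "__main__":
    print(match_all(["Quercus a1ba", "Acer saccharophorurn", "zzzz"]))
```

The `cutoff` value is an arbitrary choice for this sketch. A thread pool is shown only because the list mentions multithreading; for pure-Python string matching like this it may give little speedup, and it tends to help more when each task waits on I/O or an external OCR process.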