direct-phonology / jdsw

Parsing the "Jingdian Shiwen" with spaCy
MIT License

Add an algorithm for inferring relations between spans #61

Open thatbudakguy opened 11 months ago

thatbudakguy commented 11 months ago

We can implement this as a finite-state transducer.

thatbudakguy commented 11 months ago

transition table

| (Last span) | PER or WORK | PHON, SEM or GRAF | META | | | 本,一本,etc. | other |
| --- | --- | --- | --- | --- | --- | --- | --- |
| [nothing] | Add to the source stack | Add to the claims stack (and add JDSW to source stack?) | Warn and continue | Warn and continue | Warn and continue | Add to the source stack | Continue |
| PER or WORK | Add to the source stack | Add to the claims stack | Warn and continue | Emit a SRC between the previous two spans? | Continue | If it was PER, treat as a compound work | Clear the source and claim stacks |
| PHON, SEM or GRAF | Emit a SRC relation tree | Add to the claims stack | Emit a MOD relation | Continue (or merge to last?) | Warn and continue | Emit a SRC relation tree | Emit a SRC relation tree |
| META | Add to the source stack | Warn and continue | ??? | Warn and continue | Warn and continue | Add to the source stack | Clear the source and claim stacks |
| 本,一本,etc. | Warn and continue | Add to the claims stack | Warn and continue | Emit a SRC between the previous two spans? | Continue | Warn and continue | Continue |
| other | Add to the source stack | Warn and continue | Warn and continue | Warn and continue | Warn and continue | Add to the source stack | Continue |
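
A minimal sketch of how a table like this could drive a transducer in Python. It is an illustration under stated assumptions, not the project's implementation. It covers only the PER/WORK, PHON/SEM/GRAF and "other" columns. Every other label, including META and 本, is lumped into "other", which departs from the table. Spans are plain `(text, label)` tuples rather than spaCy spans, and the names `infer_relations`, `category` and `TRANSITIONS` are hypothetical. "Emit a SRC relation tree" is assumed to mean pairing every stacked source with every stacked claim and then clearing both stacks. Flushing pending claims at the end of input is also an assumption.

```python
import warnings
from typing import Iterable, List, Tuple

Span = Tuple[str, str]           # (text, label); stand-in for a spaCy span
Relation = Tuple[str, str, str]  # (relation label, source text, claim text)

SOURCE_LABELS = {"PER", "WORK"}
CLAIM_LABELS = {"PHON", "SEM", "GRAF"}


def category(label: str) -> str:
    """Collapse a span label into the coarse input class used by the table."""
    if label in SOURCE_LABELS:
        return "source"
    if label in CLAIM_LABELS:
        return "claim"
    return "other"  # simplification: META, 本, etc. also land here


# (last span class, current span class) -> action, for the covered columns only.
TRANSITIONS = {
    ("start", "source"): "push_source",
    ("start", "claim"): "push_claim",
    ("start", "other"): "continue",
    ("source", "source"): "push_source",
    ("source", "claim"): "push_claim",
    ("source", "other"): "clear",
    ("claim", "source"): "emit",
    ("claim", "claim"): "push_claim",
    ("claim", "other"): "emit",
    ("other", "source"): "push_source",
    ("other", "claim"): "warn",
    ("other", "other"): "continue",
}


def infer_relations(spans: Iterable[Span]) -> List[Relation]:
    sources: List[str] = []
    claims: List[str] = []
    relations: List[Relation] = []
    state = "start"

    def emit() -> None:
        # Assumed meaning of "emit a SRC relation tree": link each source
        # to each pending claim, then reset both stacks.
        relations.extend(("SRC", s, c) for c in claims for s in sources)
        sources.clear()
        claims.clear()

    for text, label in spans:
        kind = category(label)
        action = TRANSITIONS[(state, kind)]
        if action == "push_source":
            sources.append(text)
        elif action == "push_claim":
            claims.append(text)
        elif action == "emit":
            emit()
            if kind == "source":
                sources.append(text)  # assumption: the new source starts a fresh stack
        elif action == "clear":
            sources.clear()
            claims.clear()
        elif action == "warn":
            warnings.warn(f"unexpected {label} span after {state}: {text!r}")
        # "continue" needs no work
        state = kind  # assumption: the state always tracks the latest span

    if claims:
        emit()  # assumption: flush pending claims at end of input
    return relations
```

Keeping the actions in a dictionary means a cell of the table can be changed without touching the control flow, and the open questions in the table (for example the `???` cell) stay visible as missing or placeholder entries.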