Clarifications on annotation

Hi there, I am working on building a new dataset in Spanish (a polysyllabic language). I have gone through MakeDiffSinger but still have some gaps. I would be grateful if you could sanity-check my understanding and share any thoughts you might have.

Questions for clarification:

_phseq: Are these sequences of phonemes or of syllables? I am currently using phonemes and their timestamps as provided by MFA, with a pre-trained Spanish model available from MFA. Would you recommend training a new one on my specific data?
_notedur: Should the MIDI notes be estimated over phonemes, syllables, or words? For now I estimate one note per phoneme and assume ph_dur == note_dur.
_phnum: Is this the number of phonemes in each word or in each syllable? For now I assume the number of phonemes in each word.
_noteseq: Do you think SOME would suffice for a first shot at this? I would guess yes.
_isslur: How would you define a slur in this context? I have not found many resources on this topic. For now I assume there are no slurs at all.
SPs and APs: Would you recommend marking these manually, or would the enhance script be OK for a first shot?
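
To make the assumptions above concrete, here is a rough sketch of how the fields would come out under them: one note per phoneme, note durations copied from phoneme durations, phoneme counts taken per word, and no slurs. The function name, the dictionary keys, and the toy alignment values are all hypothetical and only for illustration; this is not DiffSinger or MFA code, and it does not claim these are the correct definitions.

```python
from typing import Dict, List, Tuple

# Hypothetical toy alignment: (word, [(phoneme, start_sec, end_sec), ...]).
# The timestamps are made up; real ones would come from the MFA output.
Phone = Tuple[str, float, float]
Word = Tuple[str, List[Phone]]


def build_fields(words: List[Word]) -> Dict[str, list]:
    """Build the annotation fields under the assumptions stated in the post."""
    ph_seq: List[str] = []
    ph_dur: List[float] = []
    ph_num: List[int] = []
    for _, phones in words:
        # Assumption: ph_num counts phonemes per word (not per syllable).
        ph_num.append(len(phones))
        for ph, start, end in phones:
            ph_seq.append(ph)
            ph_dur.append(round(end - start, 3))
    # Assumption: one note per phoneme, so note durations equal phoneme durations.
    note_dur = list(ph_dur)
    # Assumption: no slurs anywhere.
    is_slur = [0] * len(ph_seq)
    return {
        "ph_seq": ph_seq,
        "ph_dur": ph_dur,
        "ph_num": ph_num,
        "note_dur": note_dur,
        "is_slur": is_slur,
    }


alignment: List[Word] = [
    ("hola", [("o", 0.00, 0.12), ("l", 0.12, 0.20), ("a", 0.20, 0.45)]),
    ("mundo", [("m", 0.45, 0.52), ("u", 0.52, 0.70), ("n", 0.70, 0.78),
               ("d", 0.78, 0.85), ("o", 0.85, 1.10)]),
]

fields = build_fields(alignment)
for key, value in fields.items():
    print(key, value)
```

If any of these assumptions is wrong (for example, if the phoneme counts should not be taken per word), only the marked lines would need to change.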

Thanks!

openvpi / DiffSinger