vikasdummy / kashmiriDictionaryIssueTracker

Just to track issues, no code here.

streamline data #3

Closed vikasdummy closed 10 years ago

vikasdummy commented 10 years ago

make presentation better

vikasdummy commented 10 years ago

There are ~24000 words. Each word will need manual pruning and realignment. This step cannot be automated, for these reasons:
-- Many varieties and variations.
-- Subjective decisions in some places.

Assuming each word takes 1-2 minutes to finalize, this is an estimated 8-16 hours of continuous work. It can take about 4 to 5 person-days.

Anyone willing to help is welcome. Requirements:
-- SQLite DB browser.
-- Common sense.
-- Some knowledge of Kashmiri.

dnemesisd commented 10 years ago

What is subjective? How about we take whatever is in the string, remove all line breaks, and display it that way?

dnemesisd commented 10 years ago

Replace line breaks with a space.
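
A minimal sketch of this suggestion in Python, assuming each meaning is available as a plain string. The function name `collapse_line_breaks` and the sample text are made up for illustration and do not come from the dictionary data.

```python
import re


def collapse_line_breaks(meaning: str) -> str:
    """Replace each run of line breaks (plus adjacent whitespace) with one space."""
    return re.sub(r"\s*[\r\n]+\s*", " ", meaning).strip()


# Hypothetical sample entry, not taken from the real database.
sample = "a kind of tree\n(El.)\r\n  used in villages"
print(collapse_line_breaks(sample))
```

This only normalizes whitespace; it does not decide what belongs in the meaning.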

vikasdummy commented 10 years ago

Line breaks are not the issue. The thing is, besides the meaning of the word there is more data: references to books and page numbers, like (Siv. 1212, 1212,12312) or (EL. 121 , 12121).

There are sometimes other junk strings in the meaning as well, so each word needs proofreading and verification by a human. It is subjective in that the proofreader decides at proofreading time, to the best of their judgement, what to keep and what to remove. Pronunciation skills and a good command of the language are required to work at one word per minute and finish by next Saturday IST.
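
The review itself stays manual, but entries that contain book references could be flagged up front so the proofreader knows where to look. Below is a rough sketch in Python. The pattern is an assumption based only on the two examples above (a short source abbreviation, then comma-separated numbers, inside parentheses); the names `CITATION`, `find_citations`, and `needs_review` are hypothetical. It flags text and removes nothing.

```python
import re

# Assumed citation shape: "(" + abbreviation such as "Siv." or "K.Pr." +
# one or more comma-separated numbers + ")". Real entries may differ.
CITATION = re.compile(
    r"\(\s*[^\W\d_]{1,6}(?:\.[^\W\d_]{1,6})*\.\s*\d+(?:\s*,\s*\d+)*\s*\)"
)


def find_citations(meaning: str) -> list[str]:
    """Return every parenthesised reference found in a meaning string."""
    return CITATION.findall(meaning)


def needs_review(meaning: str) -> bool:
    """True when the meaning contains at least one reference-like string."""
    return bool(find_citations(meaning))


text = "a tree (Siv. 1212, 1212,12312). or (EL. 121 , 12121)"
for ref in find_citations(text):
    print(ref)
```

Other junk strings will not match this pattern, so an unflagged entry still needs a human look.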

dnemesisd commented 10 years ago

okay got it!

vikasdummy commented 10 years ago

Link for abbreviations (the abbreviations must be understood): http://dsalsrv02.uchicago.edu/cgi-bin/philologic/getobject.pl?c.0:1:2.grierson

vikasdummy commented 10 years ago

ab. = above.

abbr. = abbreviated.

abl. = ablative.

abs. = abstract.

acc. = accusative.

act. = active.

adj. = adjective.

adv. = adverb.

aff. = affix.

ag. = case of the agent.

agric. = agricultural.

an. = animate.

anon. = anonymous.

art. = article.

auxil. = auxiliary.

bel. = below.

ben. = benedictive mood.

B.Gr. = Burkhard, Das Verbum, die Nomina, und die Präpositionen der Kâçmîrîsprache; the translation by G.A. Grierson, reprinted from the Indian Antiquary, is the edition quoted.

card. = cardinal numeral.

caus. = causal.

cf. = confer, compare.

c.g. or com. gen. = common gender.

col. a = left-hand column of a page.

col. b = right-hand column of a page.

coll. = colloquial.

com. = commonly.

comm. = commentary.

comp. = compound.

compar. = comparative degree.

comp. p.p. = compound past participle.

con. = concrete.

cond. = conditional.

conj. = conjugation.

conj. part. = conjunctive participle.

conjnct. = conjunction.

cons. = consonant.

constr. = construction.

cont. = contemptuous.

contr. = contracted or contraction.

cor. = corrupt.

corr. = correct.

correl. = correlative or correlative pronoun.

D. = Drew, Jummoo and Kashmir Territories.

dat. = dative.

decl. = declension.

defect. = defective.

dem. = demonstrative pronoun.

den. = denominative.

der. = derivation or derivative.

dim. = diminutive.

dir. = direct.

dur. = durative.

e.g. = exempli gratia, for example.

El. = Elmslie, Kashmírí Vocabulary.

emph. = emphatic.

esp. = especial.

etym. = etymology.

euph. = euphonic.

exam. = example.

exc. = except or exception.

f. or fem. = feminine.

fac. = facetious.

fig. = figurative.

fr. = from.

freq. = frequentative.

fut. = future.

fut. p.p. = future passive participle.

gen. = genitive.

gend. = gender.

genl. = general.

geog. = geographical.

gram. = grammatical.

Gr.Gr. = Grierson, Essays on Kāçmīrī Grammar.

Gr.M. = Grierson, Kāshmīrī Manual.

ib. = ibidem, in the same place as the preceding.

id. = idem, the same meaning as that of the preceding word.

impers. = impersonal.

impf. = imperfect tense.

impve. = imperative mood.

inan. = inanimate.

incorr. = incorrect.

ind. = indicative mood.

indcl. = indeclinable.

indef. = indefinite.

inf. = infinitive.

instr. = instrumental.

intens. = intensitive.

inter. = interrogative or interrogative pronoun.

interj. = interjection.

intr. = intransitive.

introd. = introduction.

i.q. = id quod, the same as.

irr. = irregular.

K.Pr. = Knowles, Dictionary of Kashmiri Proverbs.

L. = Lawrence, The Valley of Kashmir.

l. = line.

lit. = literally.

loc. = locative.

m. or masc. = masculine.

m.c. = metri causa, for the sake of metre.

med. = medical.

met. = metaphorical.

meton. = metonymical.

myth. = mythological.

N. = name.

n. or neut. = neuter.

n.ag. = nomen agentis, noun of agency.

neg. = negative.

nom. = nominative.

num. = numeral.

obj. = object.

obl. = oblique.

obs. = obsolete.

obsc. = sensu obscœno.

onomat. = onomatopoetic.

opp. to = opposed to.

ord. = ordinal numeral.

orig. = original.

p. = page.

part. = participle.

pass. = passive.

past = past tense.

1 past = first past tense, and so on.

perf. = perfect.

pers. = person.

phon. = phonetic.

phr. = phrase.

pl. or plur. = plural.

pleon. = pleonastic.

plup. = pluperfect.

poet. = poetical.

pol. = polite.

postpos. = postposition.

p.p. = past participle.

1 p.p. = first past participle, and so on.

pphr. = periphrastic.

prec. = precative.

pref. = prefix.

prep. = preposition.

pres. = present.

pres.-fut. = present-future.

prim. = primary.

priv. = privative.

prob. = probably.

pron. = pronoun or pronominal.

prop. = properly.

prov. = proverb.

pt. = particle.

qual. = quality or qualitative.

quant. = quantity or quantitative.

q.v. = quod vide, which see.

red. = redundant.

redupl. = reduplication or reduplicated.

refl. = reflexive.

reg. = regular.

resp. = respective.

RT. = Rāja-Taraṅgiṇī, ed. Stein.

RT.Tr. = Translation of Rāja-Taraṅgiṇī by Stein. The books of the poem are quoted in small roman numerals; thus, i, ii, iii. The volumes are quoted in large roman numerals; thus, I, II.

scl. = scilicet, to be understood.

sec. = secondary.

sen. = sentence.

sg. or sing. = singular.

Śiv. = Śiva-pariṇaya of Kṛṣṇa Rāzdān.

st. = stem.

subj. = subjunctive.

subst. = substantive.

suff. = suffix.

superl. = superlative degree.

s.v. = sub voce, under the word.

tech. = technical.

term. = termination.

tr. = transitive.

transl. = translated or translation.

unphon. = unphonetic.

u.w. = used with.

v. = vide, see.

vb. = verb.

vb. intr. = intransitive verb.

vb.n. = verbal noun.

vb. suff. = verbal suffix.

vb. tr. = transitive verb.

vill. = used in villages, rural.

voc. = vocative.

vr.l. = varia lectio, different reading.

vs. = verse.

vulg. = vulgar.

W. = Wade, Kashmīrī Grammar.

wom. = used by women.

YZ. = Kāshmīrī version of Yūsuf and Zulaixā, ed. Burkhard.

-- indicates that the leading word is to be repeated, but as an independent word, and not as the first member of a compound.

-˚ at the end of a compound.

˚- at the beginning of a compound.

<2> with or without.

& and.

&c. et cetera, and so forth.

√ root.

vikasdummy commented 10 years ago

http://www.grokkingandroid.com/android-quick-tip-formatting-text-with-html-fromhtml/

vikasdummy commented 10 years ago

Assigned to Omesh, Deepak, Arun: 2200 words to prune.

vikasdummy commented 10 years ago

Will not be removing abbreviations; abbreviation info is provided. Need to see how to enable a tooltip for the user; bug filed. Spellings are fine now.
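
One possible starting point for the tooltip data is a lookup table built from the abbreviation list posted above. The sketch below is in Python and assumes the list is available as text lines of the form `abbr. = meaning.`; the helper names `parse_abbreviations` and `expand` are hypothetical, and how the app would actually display the tooltip is not covered.

```python
def parse_abbreviations(lines):
    """Build {abbreviation: meaning} from lines shaped like 'abl. = ablative.'."""
    table = {}
    for line in lines:
        key, sep, meaning = line.partition(" = ")
        if not sep:
            continue  # skip lines without " = ", e.g. the symbol notes
        # Keys such as "f. or fem." list alternative spellings.
        for variant in key.split(" or "):
            table[variant.strip()] = meaning.strip()
    return table


def expand(token, table):
    """Return the tooltip text for a token, or None if it is not an abbreviation."""
    return table.get(token)


table = parse_abbreviations([
    "abl. = ablative.",
    "f. or fem. = feminine.",
    "El. = Elmslie, Kashmírí Vocabulary.",
])
print(expand("fem.", table))
```

Lookups are case-sensitive on purpose, since the list distinguishes entries such as `l.` (line) and `L.` (Lawrence).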