project-chalam / data


Cleanup resources #1

Open ChillarAnand opened 7 years ago

ChillarAnand commented 7 years ago

Andhra Mahabarathamu-adi Parvamu
Andhra Mahabarathamu-adi Parvamu
Andhra Mahabarathamu-adi Parvmu
Andhra Mahabarathamu'adi Parvamu

udaybhaskar-v commented 7 years ago

I can remove duplicates caused by typos. But how do we remove only the unwanted numbers? I'm not sure how to decide which numbers are unwanted. Alternatively, I could remove all numbers from both the title and author fields.
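
For the typo-based duplicates, a rough sketch in Python using the standard-library `difflib` might look like the following. The function names and the 0.9 similarity threshold are assumptions for illustration, not something already in this repo. Titles whose numbers differ are never merged, so volumes of the same work stay separate. This would probably work best after any unwanted leading numbers have been stripped.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    # Lowercase and drop everything except ASCII letters and digits,
    # so hyphens, quotes, and spacing differences do not matter.
    return re.sub(r"[^a-z0-9]+", "", title.lower())


def numbers_in(title):
    # All digit runs in the title, in order of appearance.
    return re.findall(r"\d+", title)


def is_near_duplicate(a, b, threshold=0.9):
    # Titles with different numbers (e.g. volume numbers) are kept apart.
    if numbers_in(a) != numbers_in(b):
        return False
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def dedupe(titles, threshold=0.9):
    # Keep the first title of each group of near-duplicates.
    # Hypothetical helper; compares every title against all kept ones.
    kept = []
    for title in titles:
        if not any(is_near_duplicate(title, seen, threshold) for seen in kept):
            kept.append(title)
    return kept
```

The pairwise comparison is quadratic in the number of titles, so for a large catalogue it may need grouping (for example by first letter) before comparing.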

ChillarAnand commented 7 years ago

DLI titles have unwanted leading numbers. Perhaps we can remove them.

10030 Niiti Bhodha
10035 Shrii Mahabhaagavatamu
10032 Shrii Bhaarataarnd-avalagha Bhoodhini

We should leave trailing numbers as they are.

Kalyani vol 1
Kalyani vol 2
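
One possible way to express this rule, as a minimal Python sketch: strip a run of digits only when it sits at the very start of the field and is followed by whitespace and more text. The name `strip_leading_number` is hypothetical. Note that this would also strip a number that genuinely belongs at the start of a title, so the pattern may need tightening (for example, requiring several digits) once we look at more of the data.

```python
import re

# Digits at the start of the string, followed by at least one space.
LEADING_NUMBER = re.compile(r"^\s*\d+\s+")


def strip_leading_number(title):
    # Remove only a leading number; numbers elsewhere are untouched.
    return LEADING_NUMBER.sub("", title, count=1)
```

With this pattern, `10030 Niiti Bhodha` becomes `Niiti Bhodha`, while `Kalyani vol 1` is left unchanged.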

Also, there are unwanted quotation marks in the title and author fields. They should be removed as well.

An'daaruu Okka In't'ivaarei
An'kagand-itamu 1
An'kitamu
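
Removing the quotation marks could be sketched in Python roughly as below. The set of characters in `QUOTE_CHARS` is an assumption (straight and curly quotes plus the backtick), and `strip_quotes` is a hypothetical name. If the apostrophes turn out to carry transliteration information, we may want to keep the original value in a separate column before cleaning.

```python
# Characters treated as unwanted quotation marks (an assumed list).
QUOTE_CHARS = "'\"`\u2018\u2019\u201c\u201d"

# Translation table that deletes every character in QUOTE_CHARS.
_QUOTE_TABLE = str.maketrans("", "", QUOTE_CHARS)


def strip_quotes(value):
    # Delete quote characters; hyphens, digits, and spaces are kept.
    return value.translate(_QUOTE_TABLE)
```

The same function can be applied to both the title and the author fields.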