Open ashbeats opened 11 months ago
Hi there. Interesting topic; I'll post it to my Twitter. Perhaps NUS can announce a bug bounty and have some engineers take a look.
In the past what has helped is the following:
I'm sure someone can crack this problem with sufficient effort and motivation. Thanks
@ashbeats I can explore this. Please share the font's TTF file and a few sample documents.
HTML / JS script should help. Please try.
Hi,
Thank you for responding.
The documents have just been restored to the original website: https://kanian.com
Another archived site holds a bit more information: https://ccat.sas.upenn.edu/plc/tamilweb/
The fonts are available for download here: https://ccat.sas.upenn.edu/plc/tamilweb/download.html
and GPT4 had this to add...
... The TAMNET.ttf font is based on the TAM encoding system, which stands for Tamilnet. It was developed by Mr. Naa Govindasamy, an expert in Tamil encoding, and was released in 1995 by the Institute of Research in Digital Units (IRDU) in Singapore.
TAMNET.ttf is a TrueType font that uses a unique encoding scheme to represent Tamil characters. It deviates from the traditional Tamil encoding systems like TSCII (Tamil Standard Code for Information Interchange) or TAM (Tamil Monolingual Keyboard). Instead, it introduces a new layout that is optimized for ease of use and compatibility with the ASCII character set.
In TAMNET.ttf, the Tamil characters are mapped to the traditional QWERTY keyboard layout, where each key represents one Tamil character. For example, pressing the 'a' key outputs the Tamil character 'அ', 'b' outputs 'ப', 'c' outputs 'ச', and so on. This layout made it convenient for users familiar with the English keyboard layout to type Tamil characters without the need for any additional hardware or input methods.
TAMNET.ttf gained popularity during the late 1990s and early 2000s as it provided an easy-to-use encoding system ...
Found a character map of the tamilnet.ttf file here: https://fontsdata.com/76760/tamilnet.htm
Exploring how to use that table for Unicode conversion.
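A minimal sketch of what a table-driven conversion could look like, in case it helps the discussion. The byte values in `LEGACY_TO_UNICODE` are placeholders I made up for illustration, not the real TamilNet glyph codes; they would need to be filled in from a verified character map. The sketch also assumes that, as in other legacy glyph-based Tamil fonts, the left-side vowel signs (ெ, ே, ை) are stored before the consonant and have to be moved after it for Unicode. The names `convert` and `is_tamil_consonant` are hypothetical.

```python
# Hypothetical mapping: these byte values are placeholders, NOT the real
# TamilNet glyph codes. Replace them with entries from a verified table.
LEGACY_TO_UNICODE = {
    0xE1: "\u0B95",  # placeholder code -> TAMIL LETTER KA
    0xE2: "\u0BAE",  # placeholder code -> TAMIL LETTER MA
    0xAA: "\u0BC6",  # placeholder code -> TAMIL VOWEL SIGN E
}

# Vowel signs drawn to the left of the consonant (E, EE, AI).
PREFIX_VOWEL_SIGNS = {"\u0BC6", "\u0BC7", "\u0BC8"}


def is_tamil_consonant(ch: str) -> bool:
    """True for the Unicode Tamil consonant range KA..HA."""
    return "\u0B95" <= ch <= "\u0BB9"


def convert(data: bytes) -> str:
    """Map legacy bytes to Unicode and move left-side vowel signs
    from before the consonant (visual order) to after it (logical order)."""
    out = []
    pending = None  # a left-side vowel sign waiting for its consonant
    for b in data:
        ch = LEGACY_TO_UNICODE.get(b)
        if ch is None:
            # ASCII passes through; unknown high bytes become U+FFFD
            ch = chr(b) if b < 0x80 else "\uFFFD"
        if ch in PREFIX_VOWEL_SIGNS:
            if pending is not None:
                out.append(pending)
            pending = ch
            continue
        if pending is not None:
            if is_tamil_consonant(ch):
                out.extend([ch, pending])  # consonant first, then the sign
            else:
                out.extend([pending, ch])  # nothing to attach to; keep order
            pending = None
        else:
            out.append(ch)
    if pending is not None:
        out.append(pending)
    return "".join(out)
```

Unknown high bytes are turned into U+FFFD on purpose, so gaps in the table show up in the output instead of being silently dropped.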
thanks folks; @tshrinivasan - if you find a fix please post a PR to open-tamil also
Working with Udhayan of udhayam.in to get the mapping for this font.
Will update here on the progress soon.
@ashbeats - do you still need this feature? Did you make any progress?
@arcturusannamalai I do, but the project is on hold.
Hi,
My name is John. I have been attempting to convert a few thousand documents and articles written in the TamNet.ttf bilingual font, which was released in 1995. The original authors are no longer around, and I have been looking for information about the keyboard mapping so that I can write a converter, or for an existing converter, to bring the text to a standard Unicode encoding.
Do you know of a converter? I tried your Open-Tamil lib, but it failed to recognise the text or the conversions were not fully accurate.
I understand that the encoding is also questionable, as the documents were moved between various formats over the years, such as ANSI. So I have preserved copies from the originals, and have been inspecting them in binary and comparing them to the same texts written in the TamNet99 and Murasu formats.
Based on Google searches and papers, the closest seems to be TamNet99; however, there may be edge cases that elude me.
Any insights or direction would be most appreciated.
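For reference, the kind of byte-level comparison described above can be sketched roughly as follows. This is only an illustration: it assumes the Tamil glyph codes sit in the high-byte range (0x80 and above), which I have not confirmed for TamNet, and the function names are hypothetical.

```python
from collections import Counter


def high_byte_histogram(data: bytes) -> Counter:
    """Count each byte >= 0x80 (assumed range for the Tamil glyph codes)."""
    return Counter(b for b in data if b >= 0x80)


def compare(a: bytes, b: bytes):
    """Return the high bytes that occur only in a, and only in b."""
    ha, hb = high_byte_histogram(a), high_byte_histogram(b)
    only_a = sorted(set(ha) - set(hb))
    only_b = sorted(set(hb) - set(ha))
    return only_a, only_b
```

Running `compare` on the same passage saved in two formats lists the byte values that appear in one file but not the other, which is a quick way to spot where the two encodings diverge.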
Best Regards, John