I am working on a project to detect offensive language in written Pashto using Azure Language Studio. In Pakistan and Afghanistan, people commonly write Pashto in the English (Latin) alphabet on chat and social media apps. I want to detect offensive Pashto written this way: the language of the content is Pashto, but the script is the English alphabet. Here is an example:
"How are you" in English is written in Pashto as تاسو څنګه یئ, but most people in Pakistan and Afghanistan would write it in the English alphabet as "Taso sanga ye". This sentence looks like English text, but only someone who knows Pashto can understand it. I want to train a model in Azure Language Studio to detect offensive Pashto written in the English alphabet. Language Studio does have an option for the Pashto language, but for that option the text must be in the native Pashto script (تاسو څنګه یئ), whereas I need a model that can detect offensive Pashto written in the English alphabet. So there is no option in Language Studio that can do this task.
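To make the distinction concrete, here is a minimal Python sketch that tells the two written forms apart by script. It is only an illustration: the helper name `dominant_script` is hypothetical, it uses only the standard library, and it is not part of Azure Language Studio.

```python
import unicodedata


def dominant_script(text: str) -> str:
    """Return 'latin', 'arabic', or 'other' depending on which script
    most of the letters in `text` belong to (hypothetical helper)."""
    counts = {"latin": 0, "arabic": 0, "other": 0}
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits, and punctuation
        # Unicode character names start with the script name,
        # e.g. "LATIN SMALL LETTER A" or "ARABIC LETTER SEEN".
        name = unicodedata.name(ch, "")
        if name.startswith("LATIN"):
            counts["latin"] += 1
        elif name.startswith("ARABIC"):
            counts["arabic"] += 1
        else:
            counts["other"] += 1
    if sum(counts.values()) == 0:
        return "other"  # no letters at all
    return max(counts, key=counts.get)


print(dominant_script("Taso sanga ye"))   # Pashto in the English alphabet
print(dominant_script("تاسو څنګه یئ"))    # Pashto in its native script
```

A check like this only identifies the script, not the language. By script alone, "Taso sanga ye" cannot be told apart from English, which is why a dedicated language option is needed.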
To Reproduce
Steps to reproduce the behavior:
Create the required resources in Azure, then go to Azure Language Studio and create a custom text classification project.
When creating the project, you must select a language, but there is no option for Pashto written in the English alphabet.
Expected behavior
I need an option in custom text classification projects for working with Pashto written in the English alphabet.
Additional context
If the problem is unclear, I am happy to have a meeting with Microsoft to clarify it further.
Thank you for submitting this issue! The team will review your issue, tag with the appropriate tags, and comment with any additional questions on information needed. :sparkles:
Azure Language Studio for the Pashto Language
🎓 Student Ambassador