Closed hunkim closed 8 years ago
There is some source code in the files. I think it is being counted as non-English characters.
Do you want files that have no Korean (or other translated-language) text to show 0?
I think we should show the progress of translation. It should be the ratio between English and non-English words. I think we should exclude source code and other markup tags when we compute the ratio.
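To make the ratio idea concrete, here is a minimal sketch. It is not the project's actual code: the helper names `count_words` and `coverage`, the regex-based definition of a "word", and the choice of non-English words divided by all words as the percentage are all assumptions made for illustration.

```python
import re

# Assumption: an English word is a run of ASCII letters, and a translated
# word is a run of non-ASCII letters (Hangul, for example).
ENGLISH_WORD = re.compile(r"[A-Za-z]+")
NON_ENGLISH_WORD = re.compile(r"[^\W\d_A-Za-z]+")


def count_words(text):
    """Return (e_count, n_count): English and non-English word counts."""
    e_count = len(ENGLISH_WORD.findall(text))
    n_count = len(NON_ENGLISH_WORD.findall(text))
    return e_count, n_count


def coverage(text):
    """Return the translated share in percent (hypothetical definition)."""
    e_count, n_count = count_words(text)
    total = e_count + n_count
    if total == 0:
        # Nothing countable, so report 0 rather than dividing by zero.
        return 0.0
    return 100.0 * n_count / total
```

With this definition, a file with no translated text reports 0, which matches the "should have 0" behavior asked about above.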
"should have 0?" Yes, so that people clearly know what they should work on. :-)
def test_trans_coverage_file_source_code(self):
    e_count, n_count = main.trans_coverage_file("tests/sample_source_code.md")
    print("sample_source_code: ", e_count, n_count)
    self.assertEqual(e_count, 0)
    self.assertNotEqual(n_count, 0)
I guess self.assertNotEqual(n_count, 0) should be self.assertEqual(n_count, 0).
Yes. Hmm, so the source code (or any other non-prose text) should be ignored completely (not counted at all)?
If a file consists entirely of source code, we can mark it as 0(%), right?
Sure.
Sung
On Mon, Oct 17, 2016 at 3:09 PM, ming notifications@github.com wrote:
If there is complete source code, we may mark this as 0(%). right?
So, how about this?
Is it too complex? Or any other ideas?
Let's keep it simple. Just exclude source code in the equation.
OK, I'll just exclude source code and not count its length.
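One possible way to do the exclusion, as a rough sketch only: strip fenced code blocks and inline code spans from the Markdown text before counting anything. The function name `strip_source_code` is hypothetical, and this version assumes code appears only in backtick or tilde fences and in single-backtick inline spans; indented code blocks are not handled.

```python
import re

# A fenced block opens with three backticks or three tildes at the start of
# a line and closes with the same marker on a line of its own.
FENCED_BLOCK = re.compile(r"^(`{3}|~{3}).*?^\1[ \t]*$", re.DOTALL | re.MULTILINE)

# An inline span is text between single backticks on one line.
INLINE_CODE = re.compile(r"`[^`\n]+`")


def strip_source_code(markdown_text):
    """Return the text with fenced blocks and inline code spans removed."""
    without_blocks = FENCED_BLOCK.sub("", markdown_text)
    return INLINE_CODE.sub("", without_blocks)
```

A file that is nothing but a fenced block comes back as whitespace only, so both word counts would be 0 for it.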
@mingrammer Could you check our examples?
https://github.com/tensorflowkorea/tensorflow-kr/blob/master/progress.md
For example, word2vec should be 0, but that's not the case.