microsoft / Recognizers-Text

Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, date/time, etc. in multiple languages (ZH, EN, FR, ES, PT, DE, IT, TR, HI, NL. Partial support for JA, KO, AR, SV). Packages available at: https://www.nuget.org/profiles/Recognizers.Text, https://www.npmjs.com/~recognizers.text
MIT License

[JA DateTime] How to suggest spec bug fix? #1771

Closed combacsa closed 5 years ago

combacsa commented 5 years ago

Is your feature request related to a problem? Please describe.

I'm glad that this project has an excellent translated Japanese DateTime spec, even though the patterns still contain some Chinese-only phrases and grammar that are clearly not Japanese.

However, I found some bugs in the spec. Let's look at an example from DatePeriodParser.json:

  {
    "Input": "10月2日から10月22日まで不在にします。",
    "Context": {
      "ReferenceDateTime": "2016-11-07T00:00:00"
    },
    "NotSupported": "dotnet",
    "NotSupportedByDesign": "javascript,python,java",
    "Results": [
      {
        "Text": "10月2日から10月22日まで",
        "Type": "daterange",
        "Value": {
          "Timex": "(XXXX-10-02,XXXX-10-22,P20D)",
          "FutureResolution": {
            "startDate": "2017-10-02",
            "endDate": "2017-10-22"
          },
          "PastResolution": {
            "startDate": "2016-10-02",
            "endDate": "2016-10-22"
          }
        },
        "Start": 12,
        "Length": 25
      }
    ]
  },

The extracted string doesn't begin at index 12; it starts at the beginning of the input (index 0). And its length is 15, not 25. Bugs of this kind are easy to fix, I believe. A slightly tougher example is something like this:

  {
    "Input": "私は1日の日曜日に戻ります",
    "Context": {
      "ReferenceDateTime": "2017-09-27T00:00:00"
    },
    "NotSupportedByDesign": "dotNet, javascript, python, java",
    "Results": [
      {
        "Text": "1日の日曜日",
        "Type": "date",
        "Value": {
          "Timex": "2017-09-03",
          "FutureResolution": {
            "date": "2017-09-03"
          },
          "PastResolution": {
            "date": "2017-09-03"
          }
        },
        "Start": 2,
        "Length": 6
      }
    ]
  },

Here we can see "1日の日曜日", but let's look at another example from the same spec file:

  {
    "Input": "7月の第1金曜日に戻ります。",
    "Context": {
      "ReferenceDateTime": "2016-11-07T00:00:00"
    },
    "NotSupported": "dotnet",
    "NotSupportedByDesign": "javascript,python,java",
    "Results": [
      {
        "Text": "7月の第1金曜日",
        "Type": "date",
        "Value": {
          "Timex": "XXXX-07-WXX-5-#1",
          "FutureResolution": {
            "date": "2017-07-07"
          },
          "PastResolution": {
            "date": "2016-07-01"
          }
        },
        "Start": 13,
        "Length": 24
      }
    ]
  }

The Japanese way of describing the 'nth weekday' is the latter form ("7月の第1金曜日"), not the former ("1日の日曜日"). I believe this kind of grammatical error in the spec needs to be fixed.

The toughest ones might be something like this:
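As a sanity check on the resolution values in the two examples above, the "nth weekday of a month" arithmetic can be sketched with the Python standard library. This is only an illustrative sketch, not how the recognizers resolve dates; the helper name `nth_weekday` is hypothetical.

```python
import calendar
from datetime import date


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the n-th (1-based) occurrence of `weekday` (Monday=0) in a month.

    Hypothetical helper for checking spec values by hand.
    """
    # monthrange gives the weekday of the 1st and the number of days in the month.
    first_weekday, days_in_month = calendar.monthrange(year, month)
    # Days from the 1st to the first occurrence of the requested weekday.
    offset = (weekday - first_weekday) % 7
    day = 1 + offset + 7 * (n - 1)
    if day > days_in_month:
        raise ValueError("the month has no such occurrence of that weekday")
    return date(year, month, day)


# "7月の第1金曜日" (first Friday of July), future and past resolutions in the spec
print(nth_weekday(2017, 7, calendar.FRIDAY, 1))  # 2017-07-07
print(nth_weekday(2016, 7, calendar.FRIDAY, 1))  # 2016-07-01
# First Sunday of September 2017, the date used in the "1日の日曜日" example
print(nth_weekday(2017, 9, calendar.SUNDAY, 1))  # 2017-09-03
```

The dates agree with the spec, so in these two cases only the Japanese wording (and, in the second one, the Start and Length values) looks wrong, not the resolved values.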

  {
    "Input": "私は2016年04月21日の午後8:00に戻ります",
    "Context": {
      "ReferenceDateTime": "2016-11-07T00:00:00"
    },
    "NotSupportedByDesign": "dotnet,javascript,python,java",
    "Results": [
      {
        "Text": "04/21/2016, 8:00pm",
        "Type": "datetime",
        "Value": {
          "Timex": "2016-04-21T20:00",
          "FutureResolution": {
            "dateTime": "2016-04-21 20:00:00"
          },
          "PastResolution": {
            "dateTime": "2016-04-21 20:00:00"
          }
        },
        "Start": 2,
        "Length": 18
      }
    ]
  },

The extracted string should be 2016年04月21日の午後8:00. The input text is well translated but, unfortunately, the result text is not.
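Cases like this one can be found mechanically, because an expected "Text" must occur somewhere in its "Input". The following is a minimal sketch, assuming a spec file is a JSON array of cases shaped like the excerpts above; the function name `find_untranslated` is hypothetical and not part of the project.

```python
def find_untranslated(cases):
    """Return (input, text) pairs whose expected Text does not occur in Input.

    Assumes each case is a dict with "Input" and "Results" keys, as in the
    spec excerpts quoted in this issue.
    """
    suspects = []
    for case in cases:
        for result in case.get("Results", []):
            if result["Text"] not in case["Input"]:
                suspects.append((case["Input"], result["Text"]))
    return suspects


# Two cases from this issue, trimmed to the fields the check needs.
cases = [
    {
        "Input": "私は2016年04月21日の午後8:00に戻ります",
        "Results": [{"Text": "04/21/2016, 8:00pm"}],
    },
    {
        "Input": "私は1日の日曜日に戻ります",
        "Results": [{"Text": "1日の日曜日"}],
    },
]

for input_text, text in find_untranslated(cases):
    print(f"Text {text!r} not found in Input {input_text!r}")
```

Only the first case is reported. A check like this flags the entry but cannot choose the right Japanese span, so the fix itself still has to be made by hand.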

Describe the solution you'd like

I personally made some patches to fix these bugs, but since I'm a Python user and no Python Japanese DateTime model is available yet, I'm not sure how to validate my patches.

Even worse, I only have an Ubuntu machine, so it's not easy for me to figure out how to run the tests for other programming languages, such as .NET. (I found that there is a problem running the .NET package of this project on Ubuntu, some kind of version problem.)

So I need a guide on how to create a pull request whose quality has been properly checked.

Describe alternatives you've considered

Alternatively, someone other than me could fix those bugs in the spec and make a proper pull request. I made a quick-and-dirty script, fix_spec_error.zip, to automatically fix some trivial spec errors. Of course, it doesn't cover the other spec bugs.
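For readers without the attachment, the trivial class of fix (wrong "Start" and "Length") can be sketched as below. This is not the contents of fix_spec_error.zip; it is a hedged illustration in which the function name `fix_offsets` is hypothetical. It assumes that offsets count characters from 0, that the first occurrence of "Text" in "Input" is the intended one, and that the text has no characters outside the Basic Multilingual Plane (where Python's `len` and .NET's UTF-16 lengths would differ).

```python
def fix_offsets(case):
    """Recompute Start/Length for each result in one spec case, in place.

    Returns the number of results that were changed. Results whose Text does
    not occur in Input are left untouched, since they need a manual fix.
    """
    changed = 0
    for result in case.get("Results", []):
        start = case["Input"].find(result["Text"])  # first occurrence only
        if start < 0:
            continue
        length = len(result["Text"])
        if result.get("Start") != start or result.get("Length") != length:
            result["Start"] = start
            result["Length"] = length
            changed += 1
    return changed


# The DatePeriodParser.json case from this issue, trimmed to the relevant fields.
case = {
    "Input": "10月2日から10月22日まで不在にします。",
    "Results": [{"Text": "10月2日から10月22日まで", "Start": 12, "Length": 25}],
}

print(fix_offsets(case))                                        # 1
print(case["Results"][0]["Start"], case["Results"][0]["Length"])  # 0 15
```

Because the function only rewrites entries that disagree with the recomputed values, running it over an already correct case changes nothing.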

Additional context

I would very much appreciate it if someone could guide me on how to make a pull request on my own. I've already read the Contributing guide, but it only covers JavaScript and .NET, not other programming languages, and I keep failing to run the .NET tests on my local machine...

Thanks for reading.

tellarin commented 5 years ago

Unfortunately, Japanese datetime is not complete on any platform, so there's no way to validate spec changes. But we would be very happy to review your changes and merge them in. Could you create a PR just changing the relevant specs?

combacsa commented 5 years ago

@tellarin Of course. I'll create one within 48 hours :)

combacsa commented 5 years ago

Oops. 48 hours became more than 72 hours.

tellarin commented 5 years ago

No worries, @combacsa. Thanks a lot!