How do I undo an incorrect annotation when labeling data with DPT?

zhujinqiu commented 2 years ago

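I've found that when annotating data with this tool, as soon as a word is selected it has to be given a label. The label can be changed afterwards, but it cannot be deleted. Could the author provide a solution?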

yangheng95 commented 2 years ago

@lpfy Do you happen to have time to look into this issue?

lpfy commented 2 years ago

@yangheng95 @zhujinqiu I have made changes and opened a pull request to merge them. You should be able to remove a wrong tag now; however, please test it first.

yangheng95 commented 2 years ago

Thanks @lpfy

yangheng95 commented 2 years ago

Please check whether PR #15 solves your problem, @zhujinqiu.

zhujinqiu commented 2 years ago

@yangheng95 It's solved now, thank you!

yangheng95 commented 2 years ago

Thanks for your contribution. @lpfy @zhujinqiu

zhujinqiu commented 2 years ago

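@yangheng95 @lpfy I've found another bug. If I select a pair of words and combine them, then realize the selection was wrong and clear it, I can only label the combined phrase again; I can no longer select a single word to annotate. Take the first example in your file: "cute animals. bought annual pass, and we have visited 3x now." I first selected "cute animals" and combined them, then realized the annotation was wrong and wanted to clear it and label just "animals" instead. That can't be done: I can only label the earlier "cute animals", even though I had already cleared it. It looks as if, behind the scenes, once two words are combined they can never be split again, so clear has no effect in this case.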

lpfy commented 2 years ago

@zhujinqiu Yes. When I added the clear button, I didn't write any logic for splitting. I'll try to add that part tomorrow.

lpfy commented 2 years ago

@zhujinqiu I have added extra logic to the "clear button" that allows a combined term to be split. However, the logic currently handles only 2-word terms, e.g. "cute animals"; it is NOT ready for 3-word terms, e.g. "bought annual pass". I will add the logic for 3-word terms in the next couple of days.

@yangheng95 I have opened a pull request to merge.
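
For readers following the combine/clear discussion, here is a minimal sketch of one way a clear action can undo a combine for terms of any length. It is not DPT's actual implementation: the token shape (`text`, `label`, `parts`) and the function names `makeTokens`, `combine`, and `clear` are hypothetical and exist only for this illustration.

```javascript
// Hypothetical token model, not DPT's actual code: each token remembers the
// original words it was built from, so a combine can always be undone.
function makeTokens(words) {
  return words.map((w) => ({ text: w, label: null, parts: [w] }));
}

// Merge `count` adjacent tokens starting at `start` into one combined token.
function combine(tokens, start, count) {
  const group = tokens.slice(start, start + count);
  const merged = {
    text: group.map((t) => t.text).join(" "),
    label: null,
    parts: group.flatMap((t) => t.parts),
  };
  return [...tokens.slice(0, start), merged, ...tokens.slice(start + count)];
}

// Clear the token at `index`: drop its label and, if it was combined,
// split it back into unlabeled single-word tokens.
function clear(tokens, index) {
  const restored = makeTokens(tokens[index].parts);
  return [...tokens.slice(0, index), ...restored, ...tokens.slice(index + 1)];
}

const words = ["cute", "animals", ".", "bought", "annual", "pass"];
let tokens = combine(makeTokens(words), 0, 2); // "cute animals"
tokens[0].label = "Positive";
tokens = clear(tokens, 0); // back to "cute", "animals"
console.log(tokens.map((t) => t.text));
```

Because the original words are stored on the combined token, the same `clear` works for 2-word and 3-word terms without separate cases.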

yangheng95 commented 2 years ago

Thanks very much! @lpfy

yangheng95 commented 2 years ago

I suggest adding a version tag to DPT in the future, e.g. in the page title, which may help. @lpfy

lpfy commented 2 years ago

good point, will do this in the next update

lpfy commented 2 years ago

@zhujinqiu v1.05 uploaded; the "clear button" should now be able to split 2-word and 3-word terms.

@yangheng95 I have opened a pull request to merge; the version number is in the title line.

zhujinqiu commented 2 years ago

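Thanks a lot. This tool is very convenient. I'd like to request a feature for deleting an entire sentence during annotation. Some sentences in my datasets contain no sentiment at all, and deleting them directly while annotating would be much more convenient than saving first, then locating those sentences and deleting them.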

lpfy commented 2 years ago

This feature is simple; I'll try to push an update tomorrow.

linl0030 commented 2 years ago

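@lpfy Does this tool currently only support annotating English data, with no support for Chinese yet? I tried annotating Chinese data, but there seems to be no word segmentation: by default the whole review is treated as a single unit to annotate.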

lpfy commented 2 years ago

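Chinese can't be annotated for now. I work in an English environment day to day, so I haven't paid much attention to Chinese word segmentation plugins, and I don't know of any solid JavaScript segmenter that could be used. If there is a good open-source JavaScript word segmenter, please let me know. Alternatively, you can segment the sentences in advance with jieba in R and separate the words with spaces; DPT will then work.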
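
As a possible JavaScript-side alternative to pre-segmenting with jieba, the built-in `Intl.Segmenter` API can split Chinese text into words. The sketch below is not part of DPT and is not jieba: it assumes a runtime that ships `Intl.Segmenter` (recent browsers, Node.js 16+), the helper name `presegment` is made up for this example, and the word boundaries come from the runtime's own dictionary data, so they may differ from jieba's output.

```javascript
// Assumption: the runtime provides Intl.Segmenter.
// `presegment` is a hypothetical helper name, not a DPT function.
function presegment(sentence, locale = "zh") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  return Array.from(segmenter.segment(sentence), (s) => s.segment)
    .filter((piece) => piece.trim() !== "") // drop whitespace-only segments
    .join(" "); // space-separated words, the input format suggested above
}

console.log(presegment("这个工具很方便"));
```

The result is a space-separated string, which is the same shape of input that pre-segmenting with jieba would produce.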

linl0030 commented 2 years ago

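I have already segmented the text with jieba and separated the words with spaces, but DPT still can't recognize the individual words. Do you know the cause and a solution?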

lpfy commented 2 years ago

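The problem is that \b in the regular expression does not recognize Chinese. With a text editor, you can replace v.split(/\s*\b\s*/) on line 452 of the HTML file with v.split(' '), i.e. split on a space.

I need to think about how a single regular expression could handle both Chinese and English.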
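
One candidate for a single pattern that covers both languages is to match tokens with Unicode property escapes instead of splitting on `\b`. This is only a sketch, not something from DPT: the names `TOKEN` and `tokenize` are hypothetical, and the behavior is not identical to the original split in every edge case (for example, runs of punctuation separated by spaces).

```javascript
// Hypothetical replacement for v.split(/\s*\b\s*/); not part of DPT.
// \p{L} matches any letter and \p{N} any number, so Chinese characters
// count as word text. The second branch captures runs of punctuation.
const TOKEN = /[\p{L}\p{N}_]+|[^\s\p{L}\p{N}_]+/gu;

function tokenize(sentence) {
  return sentence.match(TOKEN) ?? []; // match() returns null when nothing matches
}

console.log(tokenize("cute animals. bought annual pass, and we have visited 3x now."));
console.log(tokenize("这个 工具 很 方便 。"));
```

Chinese input still has to be pre-segmented with spaces, since an unbroken run of Chinese characters would come back as one token.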

linl0030 commented 2 years ago

Thank you very much, the problem is solved!

harrywang commented 2 years ago

Thanks! It is on line 518 now:

let wtoken = _.map(this.RawABSAData, (v) => v.split(/\s*\b\s*/));

Change it to

let wtoken = _.map(this.RawABSAData, (v) => v.split(' ')); // support Chinese separated by spaces
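
To see what the change does, the standalone snippet below compares the two splits on sample strings. The sample sentences are illustrative only, and the last line shows an optional variation (trim, then split on runs of whitespace) that is not part of the fix discussed in this thread.

```javascript
const english = "cute animals. bought annual pass";
const chinese = "这个 工具 很 方便"; // already segmented, separated by spaces

// Original pattern: \b only treats ASCII letters, digits and "_" as word characters.
console.log(english.split(/\s*\b\s*/)); // [ 'cute', 'animals', '.', 'bought', 'annual', 'pass' ]
console.log(chinese.split(/\s*\b\s*/)); // one element: the whole sentence

// Space split: works for pre-segmented Chinese ...
console.log(chinese.split(" ")); // [ '这个', '工具', '很', '方便' ]
// ... but leaves English punctuation attached and keeps empty strings
// where spaces are repeated.
console.log(english.split(" ")); // [ 'cute', 'animals.', 'bought', 'annual', 'pass' ]
console.log("a  b".split(" ")); // [ 'a', '', 'b' ]

// Optional variation: trimming and splitting on runs of whitespace
// avoids the empty strings.
console.log("  a  b ".trim().split(/\s+/)); // [ 'a', 'b' ]
```

The trade-off is that the space split suits pre-segmented Chinese data, while English data loses the separate punctuation tokens that the original pattern produced.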

yangheng95 / ABSADatasets

How do I undo an incorrect annotation when labeling data with DPT? #14