issues
bitextor/bifixer
Tool to fix bitexts and tag near-duplicates for removal
GNU General Public License v3.0
29 stars
3 forks
Migrate to pyproject.toml and src/ structure
#20
ZJaume
closed
1 year ago
0
Output file encoding should be set to UTF-8
#19
rxzhangGH
closed
1 year ago
2
Monofixer adds empty extra lines when there are multiple columns
#18
cgr71ii
closed
1 year ago
1
Long sentences are not being removed apparently
#17
cgr71ii
closed
1 year ago
2
Detokenization introduces new errors to the data and ignore_detokenization is not working
#16
jgcb00
closed
2 years ago
3
'charmap' codec can't decode input_test_2.txt
#15
b3ade
closed
2 years ago
9
Conda build
#14
cgr71ii
closed
2 years ago
0
Tests don't pass
#13
cgr71ii
closed
2 years ago
5
Add sentence enumeration to paragraph identification when sentences are split
#12
cgr71ii
closed
2 years ago
6
Add headers to input and output files
#11
cgr71ii
closed
2 years ago
2
Upgrade to 0.5
#10
ZJaume
closed
3 years ago
0
Bifixer installation with pip
#9
zuny26
closed
3 years ago
2
Relax requirements conditions
#8
ZJaume
closed
3 years ago
0
Bifixer Indexerror: list index out of range
#7
jokinlasa
closed
3 years ago
4
Added new options for deferred crawling standoff annotation
#6
lpla
closed
3 years ago
0
Bifixer doesn't work with new ftfy >=6.0
#5
lpla
closed
3 years ago
2
&#9; introduces tabs in tsv output
#4
jelmervdl
closed
3 years ago
1
Obtain a clean output
#3
jgcb00
closed
3 years ago
2
Bifixer doesn't see input file
#2
Syrkovski
closed
4 years ago
2
Segmenter tests
#1
mbanon
closed
5 years ago
0