What steps will reproduce the problem?
1. Use hunfnnum.fst, the output of the Hungarian foma build. Hungarian foma can be downloaded from
https://gitorious.org/hunmorph-foma/hunmorph-foma/trees/master
2. Run do_testup.sh (attached) and watch the size of the flookup process with ps; it is 39.6 MB.
3. perl x.pl <x >x1 (both attached)
4. perl /home/en/program/foma/tktest/szokincsteszt/szeged/chkwdlistup.pl x1 > x2
(chkwdlistup.pl attached)
This takes about 5-10 minutes; after the test, the flookup size has grown to more than 54 MB.
What is the expected output? What do you see instead?
The problem is that the following function in apply.c
----------------------------------------------------------------
int apply_check_flag(struct apply_handle *h, int type, char *name, char *value)
{
    struct flag_list *flist, *flist2;
    for (flist = h->flag_list; flist != NULL; flist = flist->next) {
        if (strcmp(flist->name, name) == 0) {
            break;
        }
    }
    h->oldflagvalue = flist->value;
    h->oldflagneg = flist->neg;
    if (type == FLAG_UNIFY) {
        if (flist->value == NULL) {
            flist->value = xxstrdup(value); /* this causes the hog */
            return SUCCEED;
        }
--------------------------------------------------
duplicates a string and never frees it. I found a fix for
the problem:
in flookup.c:
at the declarations:
extern void apply_clean();
extern void apply_clean_start();
.....
void handle_line(char *s) {
    char *result, *tempstr;
    apply_clean_start();
    ....
    }
    apply_clean();
}
In apply.c:
In declarations:
static int apply_clean_variable;
#define MAX_SAVED 10
static struct flag_list *saved_flag_list[MAX_SAVED];
static char *saved_values[MAX_SAVED];
static int clean_ix;
static void apply_add_clean_list(struct flag_list *flist, char *value);
void apply_clean();
void apply_clean_start();
....
int apply_check_flag(struct apply_handle *h, int type, char *name, char *value)
{
    struct flag_list *flist, *flist2;
    for (flist = h->flag_list; flist != NULL; flist = flist->next) {
        if (strcmp(flist->name, name) == 0) {
            break;
        }
    }
    h->oldflagvalue = flist->value;
    h->oldflagneg = flist->neg;
    if (type == FLAG_UNIFY) {
        if (flist->value == NULL) {
            flist->value = xxstrdup(value);
            apply_add_clean_list(flist, flist->value);
            return SUCCEED;
        }
....
void apply_clean_start()
{
    apply_clean_variable = 1;
}
void apply_add_clean_list(struct flag_list *flist, char *value)
{
    if (apply_clean_variable) {
        saved_flag_list[clean_ix] = flist;
        saved_values[clean_ix] = value;
        /* Note: on wrap-around, apply_clean() only sees entries added
           after the wrap, so copies saved before it are not freed. */
        if (++clean_ix >= MAX_SAVED) {
            clean_ix = 0;
        }
    }
}
void apply_clean()
{
    if (apply_clean_variable) {
        int i;
        for (i = 0; i < clean_ix; i++) {
            xxfree(saved_values[i]);
            saved_flag_list[i]->value = NULL;
            saved_values[i] = NULL;
            saved_flag_list[i] = NULL;
        }
        clean_ix = 0;
        apply_clean_variable = 0;
    }
}
void apply_clean_start();
void apply_add_clean_list(struct flag_list *flist, char *value);
void apply_clean();
What version of the product are you using? On what operating system?
Newest from svn, linux debian
Please provide any additional information below.
Description of the solution:
1. Signal that we are entering handle_line() in flookup:
set apply_clean_variable = 1;
2. While apply_clean_variable == 1, remember every strdup'ed string in a list (at most 10 of them).
3. When leaving handle_line(), free all strings in the list, set the corresponding value
pointers in struct flag_list *flist to NULL, and set apply_clean_variable = 0;
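The three steps can also be tried out in isolation. The following is only a minimal sketch and is not foma code: struct flag, struct cleanup_list and all function names are hypothetical stand-ins, and plain malloc/free replace xxstrdup/xxfree. Unlike the patch, the sketch grows its list on demand instead of capping it at 10 entries; that is a deliberate deviation, made so that no remembered copy can be lost.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for foma's struct flag_list (only the fields used here). */
struct flag {
    const char *name;
    char *value;            /* heap-allocated copy, or NULL when unset */
};

/* Per-lookup bookkeeping: which flags received a fresh copy. */
struct cleanup_list {
    struct flag **items;
    size_t count;
    size_t capacity;
    int active;             /* 1 between cleanup_start() and cleanup_finish() */
};

static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Step 1: mark the start of one lookup. */
void cleanup_start(struct cleanup_list *cl)
{
    cl->count = 0;
    cl->active = 1;
}

/* Step 2: remember a flag whose value was just duplicated. */
int cleanup_remember(struct cleanup_list *cl, struct flag *f)
{
    if (!cl->active)
        return 0;
    if (cl->count == cl->capacity) {
        size_t new_cap = cl->capacity ? cl->capacity * 2 : 8;
        struct flag **grown = realloc(cl->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        cl->items = grown;
        cl->capacity = new_cap;
    }
    cl->items[cl->count++] = f;
    return 0;
}

/* Step 3: free every remembered copy and reset the flags to "unset".
   Returns the number of entries processed. */
size_t cleanup_finish(struct cleanup_list *cl)
{
    size_t freed = 0;
    if (!cl->active)
        return 0;
    for (size_t i = 0; i < cl->count; i++) {
        free(cl->items[i]->value);
        cl->items[i]->value = NULL;
        freed++;
    }
    cl->count = 0;
    cl->active = 0;
    return freed;
}

/* Release the bookkeeping array itself. */
void cleanup_destroy(struct cleanup_list *cl)
{
    free(cl->items);
    cl->items = NULL;
    cl->count = cl->capacity = 0;
}

/* Mimics the "value == NULL" branch: copy the value and register the copy.
   Returns 1 if a copy was made, 0 if the flag was already set, -1 on error. */
int flag_set_if_unset(struct cleanup_list *cl, struct flag *f, const char *value)
{
    assert(f != NULL && value != NULL);
    if (f->value != NULL)
        return 0;
    f->value = dup_string(value);
    if (f->value == NULL)
        return -1;
    if (cleanup_remember(cl, f) != 0) {
        free(f->value);     /* could not track it, so do not keep it */
        f->value = NULL;
        return -1;
    }
    return 1;
}

/* One simulated lookup; returns how many copies were released at the end. */
int run_demo(void)
{
    struct flag num = { "NUM", NULL };
    struct flag cas = { "CASE", NULL };
    struct cleanup_list cl = { 0 };
    size_t freed;

    cleanup_start(&cl);
    flag_set_if_unset(&cl, &num, "PL");
    flag_set_if_unset(&cl, &cas, "ACC");
    flag_set_if_unset(&cl, &num, "SG");   /* already set: no new copy */
    freed = cleanup_finish(&cl);
    cleanup_destroy(&cl);
    return (int)freed;
}
```

Calling run_demo() from a main() and running the program under a leak checker should show that every copy made during the simulated lookup has been released. Whether a growable list is acceptable inside apply.c is a separate question that this sketch does not answer.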
----------------------------------
I also tried eliminating the strdup in apply_check_flag and returning FAIL
instead. In that case, however, many words were no longer found, i.e. foma's
functionality breaks.
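Why returning FAIL breaks lookups can be illustrated with a simplified model of a unify check. This is a sketch under assumptions, not foma's implementation: it assumes the usual unification semantics (an unset feature accepts any value and records it; a set feature accepts only the same value), it leaves out the negation handling (the neg field), and struct feature, unify, unify_fail_when_unset and path_accepted are hypothetical names. To stay short it stores a borrowed pointer instead of a copy, which says nothing about whether that would be safe in apply.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { CHECK_FAIL = 0, CHECK_OK = 1 };

/* Simplified model: one feature with an optional current value.
   The value is a borrowed pointer to a caller-owned string. */
struct feature {
    const char *value;
};

/* Unification as assumed above: unset accepts and records the value. */
int unify(struct feature *f, const char *v)
{
    if (f->value == NULL) {
        f->value = v;
        return CHECK_OK;
    }
    return strcmp(f->value, v) == 0 ? CHECK_OK : CHECK_FAIL;
}

/* The variant that was tried: no value is recorded, unset simply fails. */
int unify_fail_when_unset(struct feature *f, const char *v)
{
    if (f->value == NULL)
        return CHECK_FAIL;
    return strcmp(f->value, v) == 0 ? CHECK_OK : CHECK_FAIL;
}

/* Runs a sequence of unify checks on a fresh feature.
   Returns 1 if every check succeeds, 0 otherwise. */
int path_accepted(int (*check)(struct feature *, const char *),
                  const char *const *values, size_t n)
{
    struct feature f = { NULL };
    for (size_t i = 0; i < n; i++) {
        if (check(&f, values[i]) == CHECK_FAIL)
            return 0;
    }
    return 1;
}
```

In this model, the first unify check on any path always meets an unset feature, so the variant that fails on unset rejects every path containing such a check, including the ones that should be accepted.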
Original issue reported on code.google.com by eleonor...@gmx.net on 16 Jan 2013 at 2:26