Closed ddeclerck closed 1 year ago
That would be possible in general, but why does GCSORT need that? Couldn't it just set the appropriate flags?
In general those tables should not be exported but a pointer to it be resolved by a function, please adjust the PR for that.
That would be possible in general, but why does GCSORT need that? Couldn't it just set the appropriate flags?
GCSORT is a C program that directly use libcob
, it does not call cobc
hence can not set "appropriate flags".
In general those tables should not be exported but a pointer to it be resolved by a function, please adjust the PR for that.
Indeed, I took a shortcut. The API will be cleaner with a function.
GCSORT is a C program that directly use
libcob
, it does not callcobc
hence can not set "appropriate flags".
I've thought that all options were passed as flags, but the collation tables are indeed passed as pointers, so those need to be resolved up-front.
This is also matching the FCD structure with its _colPtr
/ FCD-COL-SEQ-ADDRESS
fields.
So yes: a way to resolve those is needed by a new API function - the biggest question: should this really use a string (like the decipher function) or not, using only the enum then (which would be placed in common.h, otherwise in coblocal.h).
Note: there was one thing left open for the function in general: a way to provide the name of an input file (looked up in COB_CONFIG_DIR
/ by full name) to freely define the collation (for example as hex numbers. optional space separated with a newline then the ebcdic version); if we want that it seems reasonable to define a function maybe something like int cob_get_collation (char *name /* either pre-defined one or a filename */, unsigned char **ebcdic_col, **ascii_col /* the two later may be NULL in which case there is only a heck for validity */ )
. The int may be only 0 for "success" and an errno value as return, or have -1 as error (with ERRNO set) and an internal index of the resolved collation (which could then passed later for int cob_get_indexed_collation (const int col_index /* must be resolved before, may be different in another run-time unit */, unsigned char **ebcdic_col, **ascii_col)
.
Thoughts?
Side note:
Additional there is an only "on/off" flag in confFlags2
/ FCD-CONFIG-FLAGS
-> fcd--file-is-ebcdic
, but I haven't found a documented note about how this would be used.
So yes: a way to resolve those is needed by a new API function - the biggest question: should this really use a string (like the decipher function) or not, using only the enum then (which would be placed in common.h, otherwise in coblocal.h).
Note: there was one thing left open for the function in general: a way to provide the name of an input file (looked up in
COB_CONFIG_DIR
/ by full name) to freely define the collation (for example as hex numbers. optional space separated with a newline then the ebcdic version); if we want that it seems reasonable to define a function maybe something likeint cob_get_collation (char *name /* either pre-defined one or a filename */, unsigned char **ebcdic_col, **ascii_col /* the two later may be NULL in which case there is only a heck for validity */ )
. The int may be only 0 for "success" and an errno value as return, or have -1 as error (with ERRNO set) and an internal index of the resolved collation (which could then passed later forint cob_get_indexed_collation (const int col_index /* must be resolved before, may be different in another run-time unit */, unsigned char **ebcdic_col, **ascii_col)
.Thoughts?
Actually, we could even move the current hard-coded collating sequences to external files. However we have to determine how this should interact with the current mechanisms. The new option introduced by Nicolas (-febcdic-table
) is meant to configure the EBCDIC collating sequence only, but one might also want to specify an ASCII collating sequence (for our friends not using latin-1 or latin-9).
Actually, we could even move the current hard-coded collating sequences to external files
That would mean to parse those files on each call of cobc - on first use.... maybe only keep the "standard" definition in and tall others as file?
The most important part here is that this would have to be done in the next 24 hours or wait for "quite some time later".... In any case I suggest to add some API call that should be extendable later.
The new option introduced by Nicolas (
-febcdic-table
) is meant to configure the EBCDIC collating sequence only
Yes and for what it is used for (conversion in both directions) that's fine, non-latin would be something definitely for later.
Actually, we could even move the current hard-coded collating sequences to external files
That would mean to parse those files on each call of cobc - on first use.... maybe only keep the "standard" definition in and tall others as file?
Sounds reasonable.
The most important part here is that this would have to be done in the next 24 hours or wait for "quite some time later"....
Oh, well, I guess this will have to wait then.
In any case I suggest to add some API call that should be extendable later.
That's what I'm aiming for.
Just drop a note / request a review when you consider this clean enough to go and I'll check this out, responding quickly.
@GitMensch Not done yet, but feedback welcome. In particular, I'm not sure where to put the collating sequence initialization code (function init_collating_sequences
) in the generated program.
@GitMensch
I'm trying to better understand the original code and the recent changes made to EBCDIC support ; maybe you can enlighten me on something. In the version of output_param
before the recent EBCDIC patches :
cb_flag_alt_ebcdic
only considered in the case CB_TAG_ALPHABET_NAME
and not in the case CB_TAG_REFERENCE
/ CB_ALPHABET_NAME_P
?cb_flag_alt_ebcdic
considered in the case CB_ALPHABET_EBCDIC
but not in the case CB_ALPHABET_ASCII
? Shouldn't this be symmetric (i.e using two different pairs of EBCDIC tables depending on whether the flag is set or not) ?@GitMensch I'm trying to better understand the original code and the recent changes made to EBCDIC support ; maybe you can enlighten me on something. In the version of
output_param
before the recent EBCDIC patches :
- why is the flag
cb_flag_alt_ebcdic
only considered in the caseCB_TAG_ALPHABET_NAME
and not in the caseCB_TAG_REFERENCE
/CB_ALPHABET_NAME_P
?- why is the flag
cb_flag_alt_ebcdic
considered in the caseCB_ALPHABET_EBCDIC
but not in the caseCB_ALPHABET_ASCII
? Shouldn't this be symmetric (i.e using two different pairs of EBCDIC tables depending on whether the flag is set or not) ?
As far as I recall, the restricted table does not directly correspond to any native EBCDIC machine encoding, so the translation is never needed in one of the directions (to ASCII — that would actually be needed to collate ASCII items on a hypothetical restricted EBCDIC machine).
Note: the state on this PR - even with the requested changes - is quite good; and while it is a weekend... @ddeclerck @nberth the chances to get it into 3.2 after this weekend are... not good; so to make that available soon you possibly want to have a look at this.
Note: the state on this PR - even with the requested changes - is quite good; and while it is a weekend... @ddeclerck @nberth the chances to get it into 3.2 after this weekend are... not good; so to make that available soon you possibly want to have a look at this.
I think this should be mostly alright. I did what I could considering my current working conditions (I'm in a train at the moment, it's pretty noisy).
Still unsure about the get_collation_name
function : the codegen needs it to write the actual table name corresponding to the table id that is stored in the flag (or I could just have stored the flag as a string).
LGTM
Can you please add at least some function doc to the newly externalized
cob_get_collation_by_name
and tocob_get_collation_name
? Ideally we get some doc to gnucobol.texi - but this could be done next week, too-
I added comments in common.h
.
You may commit the change to 3.x now. Do you need help with the svn part?
Well, I'm not very used to SVN, but I have an up-to-date copy of the GnuCobol SVN repo. Should I just copy my files here and commit ?
Well, I'm not very used to SVN, but I have an up-to-date copy of the GnuCobol SVN repo. Should I just copy my files here and commit ?
cd branches/gnucobol-3.x # if you checked out everything, otherwise just cd into the right folder
svn update # never harms, should get you a commit from today
# overwrite the files and `svn delete` the old ones
svn status # check what you've got in the 3.x branch folder
svn commit
Warning! The base version is not the one with the 777 fix - so either svn update -r 4876
(which should get you the matching version), then place the files in, then svn update
, or creating a patch for this change and apply it to the current svn head.
Warning! The base version is not the one with the 777 fix - so either
svn update -r 4876
(which should get you the matching version), then place the files in, thensvn update
, or creating a patch for this change and apply it to the current svn head.
Yeah, I was concerned about that. I think I'll just extract a patch from my git tree (seems safer).
please try again, the commit only included the testsuite.
That's weird, only two files got transmitted. I'll check what has gone wrong (the poor internet connexion in the train does not help)
Ah, I was just not in the right directory...
Anyways, trying to commit, I now have a message complaining that the type of a file "unexpectedly changed". That file : ~ build_aux/ar-lib svn revert does not help here.
~ build_aux/ar-lib svn revert does not help here.
Does it help to rm -rf delete build_aux && svn update
?
~ build_aux/ar-lib svn revert does not help here.
Does it help to
rm -rf delete build_aux && svn update
?
Doesn't seem to.
Then go with svn update && svn status
Then svn commit cobc/* libcob/*
(or the list of files seen in svn status
).
I think that should be fine now.
now upstream
This allows to make anything using libcobol aware of the various EBCDIC encodings. In particular, we need this with the contributed tool GCSORT.