Open chriltola opened 1 month ago
The error you're encountering in SAS indicates that there is an encoding mismatch between the data file and the SAS session's encoding. The file CDWHCTL.CTL_DWH_DAILY_BATCH_DTTM
appears to be encoded in a different character set (wlatin1
), while your session is using UTF-8
.
To resolve this issue, you can try one of the following approaches:
You can convert the encoding of the dataset to match the session's encoding (UTF-8). Here is how you can do this in SAS:
libname source 'source-path' encoding=wlatin1;
libname target 'target-path' encoding=utf8;
proc copy in=source out=target;
select CTL_DWH_DAILY_BATCH_DTTM;
run;
This code will copy the dataset from the source
library with wlatin1
encoding to the target
library with UTF-8
encoding.
If you want to continue using the file in its original encoding, you can change the session encoding temporarily to match the dataset's encoding. However, this might require restarting your SAS session:
options encoding=wlatin1;
This is a temporary solution, and it means your session will run with wlatin1
encoding, so it won't be in UTF-8
until you change it back.
If most of your data is still in wlatin1
and you’re not ready to fully migrate to UTF-8
, consider setting the session encoding to wlatin1
permanently by updating your SAS configuration file (SASV9.cfg).
INENCODING
Option (For Reading Data)If you're only reading the data, you can use the inencoding
option:
data work.temp;
set source.CTL_DWH_DAILY_BATCH_DTTM(inencoding='wlatin1');
run;
This allows you to read data with a different encoding without changing the session's encoding.
Choose the method that best suits your situation depending on whether you want to convert the dataset, change the session encoding temporarily, or handle encoding differences on a case-by-case basis.
To change your existing data from wlatin1
to UTF-8
in SAS, you need to re-encode your datasets to match the new UTF-8
session encoding. Below are the steps to achieve this:
First, define two libraries in your SAS program: one for the existing data (with wlatin1
encoding) and another for the converted data (with UTF-8
encoding).
libname source 'path-to-wlatin1-data' encoding=wlatin1;
libname target 'path-to-utf8-data' encoding=utf8;
Replace 'path-to-wlatin1-data'
and 'path-to-utf8-data'
with the actual file paths to your source and target directories.
Next, you can use the PROC COPY
or DATA
step to copy and re-encode the datasets. Here's how to do it:
Option 1: Using PROC COPY
:
proc copy in=source out=target;
select CTL_DWH_DAILY_BATCH_DTTM; /* Add other datasets here if needed */
run;
Option 2: Using DATA
Step:
data target.CTL_DWH_DAILY_BATCH_DTTM;
set source.CTL_DWH_DAILY_BATCH_DTTM;
run;
After converting the datasets, it's essential to verify that the data has been correctly re-encoded:
proc contents data=target.CTL_DWH_DAILY_BATCH_DTTM;
run;
proc print data=target.CTL_DWH_DAILY_BATCH_DTTM(obs=5);
run;
Once you're satisfied with the conversion, you can replace the old data with the newly encoded data by moving the files from the target
library path back to the original location.
source
library and convert them automatically.Here’s an example loop for multiple datasets:
proc datasets lib=source nolist;
copy out=target;
select _all_;
quit;
This will copy all datasets from source
to target
, converting them from wlatin1
to UTF-8
.
This should allow you to transition your data to UTF-8
encoding smoothly while ensuring compatibility with your new SAS environment.
i try:
libname source base 'D:\DATAWAREHOUSE_PROD\Table\source' encoding=wlatin1;
libname target base 'D:\DATAWAREHOUSE_PROD\Table\target' encoding=utf8;
and got
237 /*---- Start of User Written Code ----*/
238
239 libname source base 'D:\DATAWAREHOUSE_PROD\Table\source' encoding=wlatin1;
________
22
ERROR: Libref SOURCE is not assigned.
ERROR: Error in the LIBNAME statement.
240 libname target base 'D:\DATAWAREHOUSE_PROD\Table\target' encoding=utf8;
________
22
ERROR: Libref TARGET is not assigned.
ERROR: Error in the LIBNAME statement.
ERROR 22-7: Invalid option name ENCODING.
241 /*---- End of User Written Code ----*/
The error you're encountering suggests that there might be an issue with the syntax or the path specification in your LIBNAME
statement. Specifically, the ENCODING
option is not recognized as valid in the context you used it.
Remove the base
keyword: The base
keyword is unnecessary in the LIBNAME
statement. Just specify the path directly.
Use the correct LIBNAME
statement: Here is the corrected LIBNAME
statement without the base
keyword:
libname source 'D:\DATAWAREHOUSE_PROD\Table\source' encoding=wlatin1;
libname target 'D:\DATAWAREHOUSE_PROD\Table\target' encoding=utf8;
Check the Directory Path: Ensure that the path D:\DATAWAREHOUSE_PROD\Table\source
and D:\DATAWAREHOUSE_PROD\Table\target
are correct and accessible from the SAS environment.
Encoding Option: If the ENCODING
option is still causing issues, you may need to ensure that your SAS version supports this option in the LIBNAME
statement. If not, an alternative is to handle encoding on a dataset-by-dataset basis using the DATA
step or PROC COPY
without specifying encoding in the LIBNAME
statement.
If you face issues with encoding in the LIBNAME
, you can omit it and handle encoding directly when processing the data:
libname source 'D:\DATAWAREHOUSE_PROD\Table\source';
libname target 'D:\DATAWAREHOUSE_PROD\Table\target';
data target.CTL_DWH_DAILY_BATCH_DTTM;
set source.CTL_DWH_DAILY_BATCH_DTTM(inencoding='wlatin1' outencoding='utf8');
run;
In this example:
inencoding='wlatin1'
specifies that the input data is in wlatin1
.outencoding='utf8'
specifies that the output data should be written in UTF-8
.This approach bypasses potential issues with the LIBNAME
statement and handles the encoding at the data level.
493 /*---- Start of User Written Code ----*/
494
495 libname source 'D:\DATAWAREHOUSE_PROD\Table\source' encoding=wlatin1;
________
22
ERROR: Libref SOURCE is not assigned.
ERROR: Error in the LIBNAME statement.
496 libname target 'D:\DATAWAREHOUSE_PROD\Table\target' encoding=utf8;
________
22
ERROR: Libref TARGET is not assigned.
ERROR: Error in the LIBNAME statement.
ERROR 22-7: Invalid option name ENCODING.
497 /*---- End of User Written Code ----*/
The error message indicates that the ENCODING
option is not being recognized in the LIBNAME
statement. This suggests that either:
ENCODING
option might not be valid in your SAS environment for the LIBNAME
statement.ENCODING
option to be unrecognized.Since the ENCODING
option is causing problems, you can instead handle the encoding at the dataset level without specifying it in the LIBNAME
statement. Here’s an alternative approach:
libname source 'D:\DATAWAREHOUSE_PROD\Table\source';
libname target 'D:\DATAWAREHOUSE_PROD\Table\target';
Use a DATA
step or PROC COPY
with encoding options directly on the datasets:
DATA
Stepdata target.CTL_DWH_DAILY_BATCH_DTTM (encoding=utf8);
set source.CTL_DWH_DAILY_BATCH_DTTM(inencoding='wlatin1');
run;
PROC COPY
with EncodingIf you want to copy multiple datasets:
proc copy in=source out=target;
select CTL_DWH_DAILY_BATCH_DTTM;
run;
Note: In this approach, the inencoding='wlatin1'
option is used directly on the dataset to specify the input encoding, while the output encoding is implicitly set by the target
library which is assumed to be UTF-8.
After running the above steps, verify that the data has been correctly re-encoded by examining the dataset in the target
library.
ENCODING
should be working as expected in the LIBNAME
statement. /*---- Start of User Written Code ----*/
746
747 data target.CTL_DWH_DAILY_BATCH_DTTM (encoding=utf8);
748 set source.CTL_DWH_DAILY_BATCH_DTTM(inencoding='wlatin1');
__________
22
ERROR 22-7: Invalid option name INENCODING.
749 run;
NOTE: Compression was disabled for data set TARGET.CTL_DWH_DAILY_BATCH_DTTM because compression overhead would increase the size of
the data set.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set TARGET.CTL_DWH_DAILY_BATCH_DTTM may be incomplete. When this step was stopped there were 0 observations and
0 variables.
WARNING: Data set TARGET.CTL_DWH_DAILY_BATCH_DTTM was not replaced because this step was stopped.
NOTE: PROCEDURE| _DISARM| STOP| _DISARM| 2024-08-11T16:54:58,871+07:00| _DISARM| WorkspaceServer| _DISARM| SAS| _DISARM| |
_DISARM| 11628544| _DISARM| 11628544| _DISARM| 11| _DISARM| 11| _DISARM| 1536| _DISARM| 2105656| _DISARM| 0.031250| _DISARM|
0.031000| _DISARM| 2038989298.841000| _DISARM| 2038989298.872000| _DISARM| 0.000000| _DISARM| | _ENDDISARM
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
750 /*---- End of User Written Code ----*/
The INENCODING
option is not valid in the DATA
step in SAS, which is why you're getting the error. To resolve this, you can use the PROC DATASETS
procedure to copy and convert the dataset from wlatin1
to UTF-8
.
Instead of using INENCODING
and ENCODING
options in the DATA
step, you should use PROC DATASETS
or PROC COPY
without specifying the encoding at the dataset level.
PROC DATASETS
with Re-encodinglibname source 'D:\DATAWAREHOUSE_PROD\Table\source' access=readonly;
libname target 'D:\DATAWAREHOUSE_PROD\Table\target';
proc datasets lib=source nolist;
modify CTL_DWH_DAILY_BATCH_DTTM / correctencoding=utf8;
copy out=target;
quit;
PROC COPY
libname source 'D:\DATAWAREHOUSE_PROD\Table\source' access=readonly;
libname target 'D:\DATAWAREHOUSE_PROD\Table\target';
proc copy in=source out=target;
select CTL_DWH_DAILY_BATCH_DTTM;
run;
correctencoding=utf8
: This option is used in PROC DATASETS
to ensure that the dataset's encoding is corrected when necessary.PROC COPY
: Copies the dataset(s) from source
to target
, assuming that the target
library uses UTF-8 encoding.access=readonly
: Ensures that the source library is accessed in a read-only mode, which can help prevent accidental modifications.These methods should allow you to re-encode your data correctly and avoid the errors you've encountered with previous approaches.
If the dataset was copied but contains no data in the target library, there are a few possible reasons for this, and we can take steps to diagnose and resolve the issue.
Source Dataset is Empty:
CTL_DWH_DAILY_BATCH_DTTM
) actually contains data.SAS Code to Check:
proc contents data=source.CTL_DWH_DAILY_BATCH_DTTM;
run;
proc print data=source.CTL_DWH_DAILY_BATCH_DTTM(obs=10);
run;
Data Loss During Encoding Conversion:
wlatin1
to UTF-8
might cause issues if there are characters that do not translate correctly.Issue in Copying Process:
Encoding Mismatch Still Present:
Double-Check Encoding Settings: Even though you've converted the dataset, there might still be an encoding mismatch. Verify the encodings by inspecting both libraries and datasets:
proc contents data=source.CTL_DWH_DAILY_BATCH_DTTM;
run;
proc contents data=target.CTL_DWH_DAILY_BATCH_DTTM;
run;
source
library is indeed using wlatin1
and the target
library is using UTF-8
.Library Permissions or Path Issues:
D:\DATAWAREHOUSE_PROD\Table\target
) has the appropriate write permissions and is accessible.Print the First Few Rows from the Source and Target:
To ensure that data is present and correctly copied, print the first few rows of both the source and target datasets:
proc print data=source.CTL_DWH_DAILY_BATCH_DTTM(obs=10);
run;
proc print data=target.CTL_DWH_DAILY_BATCH_DTTM(obs=10);
run;
Check the Number of Observations:
proc sql;
select count(*) as Source_Count from source.CTL_DWH_DAILY_BATCH_DTTM;
select count(*) as Target_Count from target.CTL_DWH_DAILY_BATCH_DTTM;
quit;
Review the SAS Log:
These steps should help identify where the issue lies and guide you in resolving it. Let me know if you find anything unusual during these checks!
651 /*---- Start of User Written Code ----*/
652
653 /*
654 libname CDWHCTL 'D:\DATAWAREHOUSE_PROD\Control\DWH';
655 libname target 'D:\DATAWAREHOUSE_PROD\Table\target';
656
657 proc datasets lib=CDWHCTL nolist;
658 modify CTL_DWH_DAILY_BATCH_DTTM / correctencoding=utf8;
659 copy out=target;
660 quit;\\*/
661
662 libname CDWHCTL 'D:\DATAWAREHOUSE_PROD\Control\DWH';
NOTE: Libref CDWHCTL refers to the same physical library as CDWHCTL.
NOTE: Libref CDWHCTL was successfully assigned as follows:
Engine: V9
Physical Name: D:\DATAWAREHOUSE_PROD\Control\DWH
663 libname target 'D:\DATAWAREHOUSE_PROD\Table\target';
NOTE: Libref TARGET was successfully assigned as follows:
Engine: V9
Physical Name: D:\DATAWAREHOUSE_PROD\Table\target
664
665 proc copy in=CDWHCTL out=target;
666 select CTL_DWH_DAILY_BATCH_DTTM;
667 run;
NOTE: Copying CDWHCTL.CTL_DWH_DAILY_BATCH_DTTM to TARGET.CTL_DWH_DAILY_BATCH_DTTM (memtype=DATA).
NOTE: There were 3 observations read from the data set CDWHCTL.CTL_DWH_DAILY_BATCH_DTTM.
NOTE: The data set TARGET.CTL_DWH_DAILY_BATCH_DTTM has 3 observations and 2 variables.
NOTE: Compressing data set TARGET.CTL_DWH_DAILY_BATCH_DTTM increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE COPY used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
NOTE: PROCEDURE| _DISARM| STOP| _DISARM| 2024-08-11T17:20:00,166+07:00| _DISARM| WorkspaceServer| _DISARM| SAS| _DISARM| |
_DISARM| 15044608| _DISARM| 14778368| _DISARM| 10| _DISARM| 10| _DISARM| 662016| _DISARM| 11828536| _DISARM| 0.015625|
_DISARM| 0.047000| _DISARM| 2038990800.119000| _DISARM| 2038990800.166000| _DISARM| 0.000000| _DISARM| | _ENDDISARM
668
669 proc sql;
670 select count(*) as Source_Count from CDWHCTL.CTL_DWH_DAILY_BATCH_DTTM;
671 select count(*) as Target_Count from target.CTL_DWH_DAILY_BATCH_DTTM;
672 quit;
NOTE: The PROCEDURE SQL printed pages 1-2.
NOTE: PROCEDURE| _DISARM| STOP| _DISARM| 2024-08-11T17:20:00,228+07:00| _DISARM| WorkspaceServer| _DISARM| SAS| _DISARM| |
_DISARM| 20291584| _DISARM| 15044608| _DISARM| 10| _DISARM| 10| _DISARM| 279552| _DISARM| 12108600| _DISARM| 0.031250|
_DISARM| 0.062000| _DISARM| 2038990800.166000| _DISARM| 2038990800.228000| _DISARM| 0.031250| _DISARM| | _ENDDISARM
NOTE: PROCEDURE SQL used (Total process time):
real time 0.06 seconds
cpu time 0.03 seconds
673 /*---- End of User Written Code ----*/
The output shows that the dataset was successfully copied from the CDWHCTL
library to the target
library. Here are the key points from the log:
Data Copy Confirmation:
PROC COPY
step successfully copied the dataset CTL_DWH_DAILY_BATCH_DTTM
from the CDWHCTL
library to the target
library.Data Count Verification:
PROC SQL
query confirmed that both the source (CDWHCTL.CTL_DWH_DAILY_BATCH_DTTM
) and target (target.CTL_DWH_DAILY_BATCH_DTTM
) datasets contain 3 observations.Since the data was copied successfully and the number of observations matches, everything seems to be working correctly. Here are a few additional things you can do to ensure everything is as expected:
Inspect Data in Target Library:
proc print data=target.CTL_DWH_DAILY_BATCH_DTTM(obs=5);
run;
Check Data Integrity:
proc contents data=target.CTL_DWH_DAILY_BATCH_DTTM;
run;
proc print data=target.CTL_DWH_DAILY_BATCH_DTTM(obs=5); run;
Compression Warning:
compress=no
option when creating the dataset:
data target.CTL_DWH_DAILY_BATCH_DTTM(compress=no);
set CDWHCTL.CTL_DWH_DAILY_BATCH_DTTM;
run;
Everything appears to have been copied successfully with the correct number of observations. If you need further verification or encounter any specific issues with the data, the steps above should help you investigate further.
To migrate a table from wlatin1
to UTF-8
encoding, you can follow a systematic approach to ensure the data is correctly converted and maintained. Here’s a step-by-step guide tailored for SAS:
Before starting the migration, ensure you have a backup of your original dataset.
Define your source and target libraries. Assume your source library uses wlatin1
encoding and your target library will use UTF-8
.
libname source 'D:\DATAWAREHOUSE_PROD\Table\source' encoding=wlatin1;
libname target 'D:\DATAWAREHOUSE_PROD\Table\target' encoding=utf8;
Verify that the source dataset contains data and that it is readable. Check the structure and a few records to ensure there are no hidden issues.
proc contents data=source.CTL_DWH_DAILY_BATCH_DTTM;
run;
proc print data=source.CTL_DWH_DAILY_BATCH_DTTM(obs=10);
run;
Use the PROC COPY
procedure or DATA
step to copy and convert the dataset. Since the PROC COPY
doesn’t support direct encoding conversion, you need to use the DATA
step or create a new dataset in the target library.
DATA
StepYou can copy and convert the dataset while creating it in the target library:
data target.CTL_DWH_DAILY_BATCH_DTTM;
set source.CTL_DWH_DAILY_BATCH_DTTM;
run;
PROC COPY
(If Suitable)If you use PROC COPY
, ensure that the target
library's encoding is set to UTF-8
:
proc copy in=source out=target;
select CTL_DWH_DAILY_BATCH_DTTM;
run;
Check the target dataset to ensure that the data was copied correctly and that the encoding is appropriate.
proc contents data=target.CTL_DWH_DAILY_BATCH_DTTM;
run;
proc print data=target.CTL_DWH_DAILY_BATCH_DTTM(obs=10);
run;
Ensure that any special or non-ASCII characters are properly handled. Verify that characters are correctly represented in the UTF-8
encoding.
If you have data dictionaries or metadata documentation, update them to reflect the new encoding.
Ensure that any applications, reports, or processes that use this dataset are tested with the new UTF-8
encoded data to confirm they function correctly.
wlatin1
might not have a direct representation in UTF-8
, so ensure that any data integrity issues are resolved.Following these steps should help you migrate your data from wlatin1
to UTF-8
encoding successfully.
i have following hex value in excel "48415920534F4B48414E" and i want to decode it to decimal value. how can i get?
If you want to decode the entire hex string "48415920534F4B48414E" into a text string without splitting it into pairs, you can use the following approach in Excel:
Enter the Hex String in a Cell:
A1
.Use the Following Formula to Decode:
You can use a combination of Excel functions to convert the hex string directly to text:
=TEXTJOIN("", TRUE, CHAR(HEX2DEC(MID(A1, SEQUENCE(LEN(A1)/2, 1, 1, 2), 2))))
Explanation:
SEQUENCE(LEN(A1)/2, 1, 1, 2)
: Generates a sequence of numbers corresponding to the starting positions of each pair of hex digits.MID(A1, SEQUENCE(...), 2)
: Extracts each pair of hex digits.HEX2DEC(...)
: Converts each hex pair to its decimal ASCII code.CHAR(...)
: Converts the ASCII code to the corresponding character.TEXTJOIN("", TRUE, ...)
: Joins all the characters into a single string.This formula will return the decoded string "HAY SOKHAN"
.
I apologize for the confusion. To convert the entire hex string "48415920534F4B48414E" to its corresponding text, you can use the following approach in Excel:
Enter the Hex String in a Cell:
A1
.Use the Following Formula to Convert Hex to Text:
Here's a step-by-step process to decode the hex string to a text string:
=TEXTJOIN("", TRUE, MID(A1, SEQUENCE(LEN(A1)/2, 1, 1, 2), 2))
Explanation:
SEQUENCE(LEN(A1)/2, 1, 1, 2)
: Generates a sequence of starting positions for each pair of hex digits.MID(A1, SEQUENCE(...), 2)
: Extracts each pair of hex digits.HEX2DEC(...)
: Converts each hex pair to its decimal ASCII code.CHAR(...)
: Converts the ASCII code to the corresponding character.TEXTJOIN("", TRUE, ...)
: Joins all characters into a single string.This formula will return the decoded string "HAY SOKHAN".
Yes, VBA (Visual Basic for Applications) in Excel can decode a hex string to a text string. Here’s how you can do it:
Open the VBA Editor:
ALT + F11
to open the VBA editor.Insert > Module
.Insert the Following VBA Code:
Function HexToString(HexStr As String) As String
Dim i As Integer
Dim OutputStr As String
OutputStr = ""
' Loop through each pair of hex digits
For i = 1 To Len(HexStr) Step 2
' Convert each pair of hex digits to a character
OutputStr = OutputStr & Chr(CLng("&H" & Mid(HexStr, i, 2)))
Next i
HexToString = OutputStr
End Function
Use the Function in Excel:
HexToString
function in Excel.For example, if your hex string is in cell A1
, use the following formula in another cell:
=HexToString(A1)
This will correctly decode the hex string "48415920534F4B48414E"
into the text string "HAY SOKHAN"
.
HexToString
takes the hex string as input.Chr
and concatenating them to form the final output string.This method should correctly convert any hex string into its corresponding text in Excel.
To decode a Base64-encoded hex string using VBA in Excel, you can follow the steps below. This approach first converts the hex string to binary data and then decodes it from Base64.
Open the VBA Editor:
ALT + F11
to open the VBA editor.Insert > Module
.Insert the Following VBA Code:
Function Hex64ToString(Hex64Str As String) As String
Dim i As Integer
Dim BinaryData() As Byte
Dim OutputStr As String
' Convert Hex string to binary data
ReDim BinaryData(Len(Hex64Str) / 2 - 1)
For i = 1 To Len(Hex64Str) Step 2
BinaryData((i - 1) / 2) = CByte("&H" & Mid(Hex64Str, i, 2))
Next i
' Decode binary data from Base64
Dim Base64Encoded As String
Base64Encoded = WorksheetFunction.EncodeBase64(BinaryData)
' Convert the Base64 decoded data to string
OutputStr = WorksheetFunction.DecodeBase64(Base64Encoded)
Hex64ToString = OutputStr
End Function
Use the Function in Excel:
Hex64ToString
function in your Excel sheet.For example, if your Base64-encoded hex string is in cell A1
, use the following formula:
=Hex64ToString(A1)
This function assumes that your hex string is Base64-encoded. However, if the string is only in hex and you wish to decode it directly, you should skip the Base64 part and directly convert the hex to text.
If your string is purely hex and not Base64-encoded, the correct approach depends on the exact format of the data you're dealing with.
correct email
Dear Mr. Ngoun,
Currently, SAS system does not support Khmer character yet. Thus, to handle this case, Campu ITD will extract fields which issued with Khmer character into Hex value.
Once SAS system support Khmer character, Campu ITD will perform migration by decode those hex value into Khmer character.
If you have any question, pls kindly let me know.
Thanks, Best Regards, Tola
Your email is clear, but I’ve made a few minor adjustments for clarity and formality:
Subject: Handling Khmer Characters in SAS System
Dear Mr. Ngoun,
I hope this message finds you well.
Currently, the SAS system does not support Khmer characters. To address this, Campu ITD will extract the fields containing Khmer characters and convert them into Hex values.
Once the SAS system supports Khmer characters, Campu ITD will migrate the data by decoding those Hex values back into Khmer characters.
If you have any questions, please feel free to reach out to me.
Thank you.
Best regards,
Tola
This version adds a subject line, improves the flow, and ensures the tone is formal and polite.
ERROR: File CDWHCTL.CTL_DWH_DAILY_BATCH_DTTM cannot be updated because its encoding does not match the session encoding or the file is in a format native to another host, such as WINDOWS_64.