Closed natalie-o-perret closed 9 months ago
Hi @natalie-o-perret , when writing the XML files composing the XLSX package I have managed to produce the smallest set of files and attributes that made Excel and LibreOffice happy. I may have missed something that the standard mandates for a proper XLSX file, so maybe SAP is really looking for something that is missing or different. Please allow me to compare your versions and try to guess it, but I need your help to make some checks because I don't have SAP at my disposal. Thanks, Salvo
Hi @salvois,
So glad to hear your feedback! Sure, lemme know if you have some ideas of what might help SAP to see the files as proper Excel files.
Both files are available in my OP:
I also have access to a test/dev instance if you need me to try out some new types of generated files.
Hi @natalie-o-perret , I have made some changes to the XML files generated by LargeXlsx to make them closer to the ones generated by Excel. Are you able to check out the master branch and include LargeXlsx as a source project in your project to try it? Thanks, Salvo
Hi @natalie-o-perret , I have just released version 1.7.0 which includes the changes I mentioned in my last comment. Please let me know whether it helps with your problem. Thanks, Salvo
Hi @salvois, sorry to only get back to you """now""".
Long story short, no, it still doesn't address the issue.
So, after banging my head against the wall countless times at night and tweaking the generation to systematically leverage shared strings whenever there are strings (SAP-journal-entries-excel-file compliance is bonkers), I think I've finally found the issue. Also had to tweak the way the styles are passed.
*.xlsx (*.zip) files
With the new implementation, we can get a file like this: test.xlsx
If you open it with say, 7-Zip, you will see:
As you can see, the entries / files are marked in the archive with the attributes: -r--------.
If I extract the files from the archive and stuff them back into the archive (or a new one): test - but sap-compliant.xlsx
You will then see:
the entries are now marked with the attribute N.
Which kind of hints to me that there is something off with the zip dependency used by LargeXlsx, i.e. SharpCompress, and the library doesn't come with a lot of options. And it doesn't seem, to my knowledge (correct me if I'm wrong), that we can set attributes on the archive entries / files.
An alternative that might work would be to leverage ICSharpCode.SharpZipLib, which has, among other things, a ZipEntry type that carries more properties, as you can see in the doc: http://icsharpcode.github.io/SharpZipLib/api/ICSharpCode.SharpZipLib.Zip.ZipEntry.html
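For anyone who wants to compare entry attributes without 7-Zip, here is a minimal sketch that dumps the raw external attributes of each entry. It uses the BCL's `System.IO.Compression` instead of SharpCompress or SharpZipLib, and the `ZipEntryInspector` class name is made up for illustration. As far as I know this API only exposes the raw 32-bit value, not the "host system" field, so how to interpret the number is left to the reader.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of LargeXlsx.
public static class ZipEntryInspector
{
    // Returns (entry name, raw external attributes) for every entry in the archive.
    // The stream must be seekable; it is left open for the caller to dispose.
    public static List<(string Name, int ExternalAttributes)> ReadExternalAttributes(Stream zipStream)
    {
        var result = new List<(string Name, int ExternalAttributes)>();
        using var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true);
        foreach (var entry in archive.Entries)
        {
            result.Add((entry.FullName, entry.ExternalAttributes));
        }
        return result;
    }
}
```

Usage would be something like `ZipEntryInspector.ReadExternalAttributes(File.OpenRead("test.xlsx"))`, printing each value with `$"{name}: 0x{attrs:X8}"` to compare the generated file against the repacked one.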
Hi @natalie-o-perret , you did a really impressive investigation job, thanks! Please allow me to try to summarize them to make sure I'm understanding your findings correctly. You have found a way to make SAP happy about files created by LargeXlsx by just tweaking the way you use the library, that is by enforcing use of shared strings and some specific styles. Moreover you have found that the zip file (the actual XLSX file) produced by LargeXlsx is not compatible with SAP and you need to unzip and zip it again with a third party zip compressor. Is this summary correct? Do you need both of those? I remember having similar issues with very large files on LibreOffice due to the second (the zip one) issue. I see that the author of SharpCompress himself is considering SharpZipLib for his zip backend, so maybe writing a version of LargeXlsx using it is worth a try. Thanks, Salvo
Hey @salvois, yea that took me quite a (very) long while to figure most of the stuff out. And yea you got the whole thing right ~~
Hi @natalie-o-perret , I have just pushed a version of LargeXlsx which replaces SharpCompress with ICSharpCode.SharpZipLib to pack the XLSX file. Could you try that version from sources to check whether that helps with your problem? You find it in the feature/sharpziplib branch. As far as I'm concerned, I've noticed that SharpZipLib presents the same problem that SharpCompress used to in a previous version, that is in ZIP64 mode they save zip headers in a way that LibreOffice considers invalid, thus I could not adopt SharpZipLib lightly, but that could be a start. Thanks, Salvo
Hey 🙋‍♀️ @salvois, thanks a ton again! 🙇‍♀️
I'm gonna try to work something out with your feature/sharpziplib branch, keep you posted 🪖
Hi @natalie-o-perret , a quick note, just in case you didn't notice, that I merged the latest changes into the feature/sharpziplib branch.
Thanks, I'm not quite there yet. I played with it a bit before you merged master into feature/sharpziplib, but haven't managed to make it work / SAP-upload-compliant so far.
Basically what I've been doing so far,
Creating an extension method like below:
    using ICSharpCode.SharpZipLib.Zip;

    namespace LargeXlsx
    {
        internal static class ZipOutputStreamExtensions
        {
            // Starts a new zip entry with an explicit host system (MS-DOS by default)
            // and explicit external file attributes, instead of the library defaults.
            public static void PutNextMsDosEntry(
                this ZipOutputStream source,
                string name,
                int fileAttributes,
                HostSystemID hostSystemId = HostSystemID.Msdos)
            {
                var zipEntry = new ZipEntry(name)
                {
                    HostSystem = (int)hostSystemId,
                    ExternalFileAttributes = fileAttributes
                };
                source.PutNextEntry(zipEntry);
            }
        }
    }
And then used it in, respectively:
SharedStringTable.cs with zipOutputStream.PutNextMsDosEntry("xl/sharedStrings.xml", 0);
Stylesheet.cs with zipOutputStream.PutNextMsDosEntry("xl/styles.xml", 0);
Worksheet.cs with zipOutputStream.PutNextMsDosEntry($"xl/worksheets/sheet{id}.xml", 32);
XlsxWriter.cs with:
_zipOutputStream.PutNextMsDosEntry("[Content_Types].xml", 32);
_zipOutputStream.PutNextMsDosEntry("_rels/.rels", 0);
_zipOutputStream.PutNextMsDosEntry("xl/workbook.xml", 0);
_zipOutputStream.PutNextMsDosEntry("xl/_rels/workbook.xml.rels", 0);
Hi @natalie-o-perret , I'm closing this issue because, for the time being, I'm not able to provide further assistance based on current information. Please feel free to reopen in case you have further details to share and possibly delve into. Thanks, Salvo
Hi @salvois,
I'm re-opening this issue, as I was delving once again into these shenanigans lately, turns out the issue (for SAP) is in the construction of the sheet1.xml
The differences are essentially twofold, but they boil down to the same thing: proper marking of coordinates for both rows and cells.
They both should have the proper XML attribute for coordinates (e.g., <row r="19"> and <c r="B19" s="7" t="s">).
[EDIT] Ended up doing something similar to https://github.com/salvois/LargeXlsx/pull/31 systematically.
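To illustrate what "proper marking of coordinates" means here, below is a small sketch of how a writer could compute the `r` attribute values for rows and cells. The `CellReferences` class and its method names are hypothetical and are not LargeXlsx's actual implementation. The only facts relied upon are the A1 reference style (column letters followed by the 1-based row number) and the attribute shapes quoted above.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration only.
public static class CellReferences
{
    // Converts a 1-based column number to letters: 1 -> "A", 26 -> "Z", 27 -> "AA".
    public static string ColumnName(int column)
    {
        if (column < 1) throw new ArgumentOutOfRangeException(nameof(column));
        var sb = new StringBuilder();
        while (column > 0)
        {
            column--; // shift to 0-based so that 26 maps to 'Z' rather than rolling over
            sb.Insert(0, (char)('A' + column % 26));
            column /= 26;
        }
        return sb.ToString();
    }

    // Builds an A1-style reference from 1-based column and row numbers, e.g. (2, 19) -> "B19".
    public static string CellReference(int column, int row) =>
        ColumnName(column) + row.ToString(CultureInfo.InvariantCulture);

    // Opening tag of a row carrying its explicit row number.
    public static string RowTag(int row) =>
        $"<row r=\"{row.ToString(CultureInfo.InvariantCulture)}\">";

    // Opening tag of a shared-string cell carrying its explicit reference and style id.
    public static string SharedStringCellTag(int column, int row, int styleId) =>
        $"<c r=\"{CellReference(column, row)}\" s=\"{styleId.ToString(CultureInfo.InvariantCulture)}\" t=\"s\">";
}
```

With these helpers, `RowTag(19)` and `SharedStringCellTag(2, 19, 7)` produce the two tags quoted above.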
Hello @natalie-o-perret , I'm glad you have found your issue. This puzzles me, though, because your original issue report was on release 1.6.3, which did produce row and cell references, while release 1.7.1, which included @MarkPflug's optimization to remove unnecessary row and cell references, was released only on 2023-02-26, that is after your initial tests. Would you mind double checking whether 1.7.0 does indeed work for you? Thanks, Salvo
For what it's worth, in ECMA-376 the "r" attribute is defined as use="optional" on both the CT_Row and CT_Cell types. Probably not worth much when things aren't working for you though.
@salvois I think there was some mix-up on my end at the time. I mean, when you spend your time comparing and adjusting which bits make SAP reject a file and which don't (e.g., SAP doesn't support inline strings in worksheet XMLs, they have to be part of sharedStrings.xml), it's a lot of shots in the dark and countless *.xlsx files unzipped and dissected, just because support can't address the issue and the business isn't willing to change how they operate 🤷‍♀️. What I can guarantee is that it now works by using the same trick as in https://github.com/salvois/LargeXlsx/pull/31
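For readers unfamiliar with the inline vs. shared string distinction, here is a rough sketch of the two cell shapes in SpreadsheetML. An inline string carries its text inside the cell (`t="inlineStr"`), while a shared string cell (`t="s"`) only stores a zero-based index into the table written to sharedStrings.xml. The `SharedStringIndex` class is made up for illustration and is not how LargeXlsx implements it.

```csharp
using System.Collections.Generic;
using System.Security;

// Hypothetical helper for illustration only.
public sealed class SharedStringIndex
{
    private readonly Dictionary<string, int> _indexes = new();
    private readonly List<string> _strings = new();

    // Strings in insertion order, i.e. the order they would appear in sharedStrings.xml.
    public IReadOnlyList<string> Strings => _strings;

    // Returns the zero-based index of the string, adding it on first use.
    public int GetOrAdd(string text)
    {
        if (!_indexes.TryGetValue(text, out var index))
        {
            index = _strings.Count;
            _strings.Add(text);
            _indexes.Add(text, index);
        }
        return index;
    }

    // Cell whose text is embedded directly in the worksheet XML.
    public static string InlineCell(string reference, string text) =>
        $"<c r=\"{reference}\" t=\"inlineStr\"><is><t>{SecurityElement.Escape(text)}</t></is></c>";

    // Cell that only references the shared string table by index.
    public string SharedCell(string reference, string text) =>
        $"<c r=\"{reference}\" t=\"s\"><v>{GetOrAdd(text)}</v></c>";
}
```

Repeated values such as "FR01" end up stored once in the table, and every cell using them just points at the same index.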
@MarkPflug Well, my not-so-wild guess is that SAP probably just uses some libraries that aren't properly compliant; it's SaaS and closed source anyway.
Hello @natalie-o-perret , I have finally had a chance to merge #31 , so the current master lets you force LargeXlsx to write row and cell references even when redundant. That will be included in the next release. Please feel free to give it a try at your convenience. Thanks, Salvo
@salvois hey there 🙋‍♀️, will check this out. Thanks for the hard work 🙇‍♀️
Hi @natalie-o-perret , I have just released version 1.9.0, which allows including row and cell references even when redundant. Please feel free to check whether that solves your issue. Thanks, Salvo
Hi @natalie-o-perret , I'm closing this issue. Should the problem persist, please reopen it. Thanks, Salvo
Hi 🙋‍♀️,
I'm using your awesome LargeXlsx library (version 1.6.3) with .NET 7.0.101 to generate SAP journal entries files that are then uploaded to SAP (according to their doc).
Expand / Collapse the C# implementation
`AccrualsGenerator.cs`:
```csharp
using LargeXlsx;
using SharpCompress.Compressors.Deflate;

namespace ExcelMeh;

public static class AccrualsGenerator
{
    public static void Generate(string path)
    {
        using var stream = new FileStream(path, FileMode.Create, FileAccess.Write);
        using var xlsxWriter = new XlsxWriter(stream, CompressionLevel.BestCompression, true);
        xlsxWriter.SetDefaultStyle(Constants.Styles.Default)
            .BeginWorksheet(Constants.Values.WorksheetName, columns: Constants.ColumnsMin)
            .BeginRow()
            .Write(Constants.Values.Title, Constants.Styles.Title)
            .BeginRows(Constants.Values.Comments, Constants.Styles.Bold)
            .BeginRow()
            .Write(Constants.Values.BatchId, Constants.Styles.DarkYellowBg)
            .Write(Constants.Styles.DarkYellowBg)
            .SkipRows(2)
            .BeginRow(style: Constants.Styles.DarkBlueBoldBgLeft)
            .Write(1, Constants.Styles.DarkBlueBoldBg)
            .WriteSharedString(Constants.Values.Header, Constants.Styles.DarkBlueBoldBg)
            .BeginRow()
            .SkipColumns(1).WriteSharedStrings(Constants.Values.HeaderColumNamesMin, Constants.Styles.LightBlueBg)
            .BeginRow()
            .SkipColumns(1).WriteSharedStrings(Constants.Values.HeaderColumDescriptionsMin, Constants.Styles.LightBlueBg)
            .BeginRow()
            .SkipColumns(1)
            .Write("FR01")
            .Write("AD")
            .Write(new DateTime(2023, 01, 01), Constants.Styles.ShortDate)
            .Write(new DateTime(2023, 01, 01), Constants.Styles.ShortDate)
            .Write("MERC11-1111444-55500")
            .Write("EUR")
            .SkipRows(1)
            .BeginRow()
            .SkipColumns(1)
            .WriteSharedString(Constants.Values.LineItems, Constants.Styles.BlueBg)
            .BeginRow()
            .SkipColumns(4)
            .WriteSharedString(Constants.Values.TransactionCurrency, Constants.Styles.LightBlueBg, 2)
            .BeginRow()
            .SkipColumns(1)
            .WriteSharedStrings(Constants.Values.LineItemColumnNamesMin, Constants.Styles.LightBlueBg)
            .BeginRow()
            .SkipColumns(1)
            .WriteSharedStrings(Constants.Values.LineItemColumnDescriptionsMin, Constants.Styles.LightBlueBg)
            .BeginRow()
            .SkipColumns(1)
            .Write("FR01")
            .Write("63333333")
            .Write("FNP-MERC-4242424-23232-898989/0101-E3 Meow")
            .Write(35.09)
            .SkipColumns(1)
            .Write("V0")
            .Write("FR01010002_0013")
            .Write("7226")
            .BeginRow()
            .SkipColumns(1)
            .Write("FR01")
            .Write("11100")
            .Write("FNP-MERC-4242424-23232-898989/0101-E3 Meow")
            .Write(7.02)
            .SkipColumns(1)
            .Write("V0")
            .Write("FR01010002_0013")
            .Write("7226")
            .BeginRow()
            .SkipColumns(1)
            .Write("FR01")
            .Write("240205")
            .Write("FNP-MERC-4242424-23232-898989/0101-E3 Meow")
            .SkipColumns(1)
            .Write(42.11)
            .Write("V0")
            .Write("FR01010002_0013")
            .Write("7226")
            .BeginRow();
    }
}
```
`Constants.cs`:
```csharp
using System.Drawing;
using LargeXlsx;

namespace ExcelMeh;

public static class Constants
{
    public static class Colors
    {
        public static readonly Color DarkYellow = Color.FromArgb(255, 192, 0);
        public static readonly Color Blue = Color.FromArgb(180, 198, 231);
        public static readonly Color DarkBlue = Color.FromArgb(142, 169, 219);
        public static readonly Color LightBlue = Color.FromArgb(237, 241, 249);
        public static readonly Color ForestGreen = Color.FromArgb(169, 208, 142);
    }

    public static class Fills
    {
        public static readonly XlsxFill DarkYellow = new(Colors.DarkYellow);
        public static readonly XlsxFill DarkBlue = new(Colors.DarkBlue);
        public static readonly XlsxFill Blue = new(Colors.Blue);
        public static readonly XlsxFill LightBlue = new(Colors.LightBlue);
        public static readonly XlsxFill ForestGreen = new(Colors.ForestGreen);
    }

    public static class Alignments
    {
        public static readonly XlsxAlignment Left = new(XlsxAlignment.Horizontal.Left);
        public static readonly XlsxAlignment Center = new(XlsxAlignment.Horizontal.Center);
        public static readonly XlsxAlignment Right = new(XlsxAlignment.Horizontal.Right);
    }

    public static class Fonts
    {
        public const string Name = "Calibri";
        public const double Size = 10;
        public const double TitleSize = 13;
        public const double BoldSize = 11;
        public static readonly XlsxFont Default = new(Name, Size, Color.Empty);
        public static readonly XlsxFont Title = Default.WithSize(TitleSize);
        public static readonly XlsxFont Bold = Default.WithSize(BoldSize).WithBold();
    }

    public static class NumberFormats
    {
        public const string ShortDatePattern = "DD/MM/YYYY";
        public static readonly XlsxNumberFormat ShortDate = new(ShortDatePattern);
    }

    public static class Styles
    {
        public static readonly XlsxStyle Default = XlsxStyle.Default.With(Fonts.Default);
        public static readonly XlsxStyle Title = XlsxStyle.Default.With(Fonts.Title);
        public static readonly XlsxStyle Bold = XlsxStyle.Default.With(Fonts.Bold);
        public static readonly XlsxStyle DarkYellowBg = XlsxStyle.Default.With(Fills.DarkYellow);
        public static readonly XlsxStyle DarkBlueBoldBg = Bold.With(Fills.DarkBlue).With(Alignments.Right);
        public static readonly XlsxStyle DarkBlueBoldBgLeft = Bold.With(Fills.DarkBlue).With(Alignments.Left);
        public static readonly XlsxStyle BlueBg = Default.With(Fills.Blue).With(Alignments.Left);
        public static readonly XlsxStyle LightBlueBg = Default.With(Fills.LightBlue).With(Alignments.Center);
        public static readonly XlsxStyle LightGreenBg = Default.With(Fills.ForestGreen).With(Alignments.Center);
        public static readonly XlsxStyle ShortDate = Default.With(NumberFormats.ShortDate);
    }

    private const double InToPxRatio = 13;
    public static readonly IReadOnlyCollection
```
If I try to use the files as-is after they have been generated and push them to SAP, it doesn't work:
But if I open them with Libre Office or MS Office and save them, I can then push to SAP:
File Examples:
Expand / Collapse to see the substantial changes found between the two files / archives (since `*.xlsx` files are `*.zip` archives containing `*.xml` files in disguise)
`test - original\xl\styles.xml`:
I was wondering if you had any idea about the differences introduced when Libre Office saves the original file to a copy 🤔. I mean there are changes, but it's still "kinda" the same thing, yet different enough that SAP can accept the file 🤔
Would like to avoid resaving the file via Libre / MS Office to be able to push those files to SAP.
Any idea / hint?