dotnet / Open-XML-SDK

Open XML SDK by Microsoft
https://www.nuget.org/packages/DocumentFormat.OpenXml/
MIT License

The optimized path is determined by NULL #1773

Open waydalee opened 3 weeks ago

twsouthwick commented 3 weeks ago

Can you provide details as to why you see this as a needed change?

waydalee commented 3 weeks ago

Can you provide details as to why you see this as a needed change?

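Not every value is exactly "NULL". In practice I have run into values like "XXX\XXX\NULL", and if they are not filtered out, later functionality throws errors.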

twsouthwick commented 3 weeks ago

Can you provide a sample document that has this? How was it created?

@tomjebo is this kind of thing expected from the spec?

waydalee commented 3 weeks ago

I ran into this in production and don't have a ready-made sample at the moment. "XXX\XXX\NULL" conforms to the spec, doesn't it?

tomjebo commented 2 weeks ago

So I'm not sure where the "NULL" comes from in the original code:

                    if (!relationship.TargetUri.ToString().Equals("NULL", StringComparison.OrdinalIgnoreCase))

Should this really be:

                    if (!relationship.TargetUri.ToString().Equals("", StringComparison.OrdinalIgnoreCase))

An actual null byte in a URI isn't a problem on the surface, although it does raise security questions for me, i.e. null byte injection. But technically a null byte inside or at the end of a URI is valid. A URI that is only a null byte, for example an empty string, is not a valid URI, and System.Uri won't construct it.

@twsouthwick I searched our code and can't find where we might inject "NULL" into or in place of a Uri. Maybe I'm just missing it?

tomjebo commented 2 weeks ago

@waydalee Please provide a document that has this condition ("NULL" in the relationship target URI).

waydalee commented 2 weeks ago

新建 Microsoft Word 文档.docx

tomjebo commented 2 weeks ago

@waydalee did you create Microsoft.Word.docx with Microsoft Word or did you create it with code? Specifically, did Microsoft Word add:

 <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/NULL"/>

If Word added them can you tell me what steps you took to get Word to create them?

I've been playing with Word and trying different "NULL" strings added to relationship targets or replacing them with "NULL". Word seems to ignore these kinds of targets but also doesn't respect the relationship. But I'm not sure that Word does this itself. As a matter of fact, I can add any invalid string to a relationship and Word will have the same behavior. I'll check with the Word product team to see if they recognize this.

waydalee commented 2 weeks ago

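I produced the data by editing the XML file directly, which may not be entirely appropriate. This was an abnormal case we hit in production, and the situation was urgent at the time. Through debugging I found that the document contained "XXX\XXX\NULL", so I modified the source directly and compiled it into a local DLL. The problem has not recurred in production since then (more than a year), so I don't have a real sample document on hand.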

tomjebo commented 2 days ago

@waydalee I found out more about the history of "NULL" in relationship target URIs. Office interprets a target URI containing only the string "NULL" to mean that the target points to nothing (no part). It was likely added for or by virus checkers to remove malicious URIs. However, target URIs that end with "NULL" but also contain otherwise valid URI characters are considered by Office to be valid target URIs and therefore, I would not approve adding this PR's proposed change.
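
To illustrate the distinction described above, here is a minimal sketch of an exact-match check. `NullTargetSketch` and `IsNullTarget` are hypothetical names used only for illustration and are not part of the Open XML SDK; the comparison itself mirrors the line quoted earlier in this thread. It assumes relative target URIs, as in a relationship `Target` attribute.

```csharp
using System;

static class NullTargetSketch
{
    // Hypothetical helper (not an SDK API): true only when the whole
    // target is the literal string "NULL", ignoring case.
    public static bool IsNullTarget(Uri targetUri) =>
        targetUri.ToString().Equals("NULL", StringComparison.OrdinalIgnoreCase);

    static void Main()
    {
        // Target is exactly "NULL": treated as pointing to no part.
        Console.WriteLine(IsNullTarget(new Uri("NULL", UriKind.Relative)));

        // Target merely ends with "NULL": still an ordinary target.
        Console.WriteLine(IsNullTarget(new Uri("media/NULL", UriKind.Relative)));
    }
}
```

Under this check, a target such as "media/NULL" is not skipped, so it is still expected to resolve to a real part in the package.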

waydalee commented 2 days ago

According to Office product team, relationship targets with valid URI characters in addition to "NULL" (ending in "NULL") should be considered valid targets for loading.

If a URI ending in "NULL" is encountered, an exception is thrown: "Specified part does not exist in the package." Shouldn't this exception be fixed?

waydalee commented 2 days ago

What I care about more is that the SDK stays compatible with the documents that Office can handle.