Closed Monirzadeh closed 1 year ago
@fmartingr do you have idea how should i update unit test for this situation or i should just remove this for new logic?
id should pass some rules. if one of this not pass we can't show images correctly on all device.
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_:.
in old method it use filename as id. before add file to opf
pass filename to fixXMLid
to check all rules and fix them if it is not suitable. for example add id
to the first, if file name not start with English alphabet.
but it is not done that prefect.
for example if filename is something like this filename-with-حروف-غیر-انگلیسی.jpg
old method can't handle that and not remove that character is not acceptable. if filename is just contain not acceptable character it can't handle that just removing them is not a good solution and detect that in an string can be challenging (for each language in the world that someone named a file).
we need a more reliable way to handle that is to always create acceptable id for files.
why we change attribute?
to show up correctly image in ebook file img
should work in specific way ,if not we cant show images on all device device.
img
should contain at least one src
when point to a local image if it is just contain data-src
that point to a local image we can't show that correctly on all devicesrc
than how many data-src
needed if this order is reverse some ebook-reader can't show imagesrc
that point to same image in img
tag get us in trouble it should not happen.this method check each img
tag individually and fix above problem.
img
tag contain at least one src
tagsrc
and fix that if it is not.src
in img
tag@fmartingr my problem with update unit test is each time we generate new id so how should i test that in unit test. how can i get that from run to check at the end of test and if there same or not in final file?
@fmartingr my problem with update unit test is each time we generate new id so how should i test that in unit test. how can i get that from run to check at the end of test and if there same or not in final file?
Maybe you can use a v5 UUID? Instead of creating a random UUID, you can create one derivated from the filename:
// You need a namespace for V5 uuids, we can generate one based on the project URL
// Some namespaces are predefined in the UUIDs RFC, like the one used for URLs
namespace := uuid.NewV5(uuid.NameSpaceURL, "github.com/go-shiori/go-epub")
fileIdentifier := uuid.NewV5(namespace, filename)
More info: RFC 4122 - A Universally Unique IDentifier (UUID) URN Namespace - Some Name Space IDs
@fmartingr https://github.com/go-shiori/go-epub/pull/3/commits/383822551c1bea520aed4c84f986d239d84beded is it what you want?
do you have any idea how should i fix unit test?
@fmartingr 3838225 is it what you want?
do you have any idea how should i fix unit test?
I understood that the problem with the test was that you were creating random UUIDs for images. If you change that to use deterministic UUIDs (uuidv5) you can generate those at the end of your tests and the result will be the same UUID.
I don't see any test in this PR, is there any code I can check?
TestManifestItems
not pass
--- FAIL: TestManifestItems (0.01s)
epub_test.go:824: Package file manifest items don't match
Got:
id="id5857b58e-9d4f-5b23-ad10-d2644bb66023" href="images/filename with space.png" media-type="image/png"></item>
id="id72c19376-e8fb-584c-9dbe-abc447874747" href="images/testfromfile.png" media-type="image/png"></item>
id="id971cabb4-8f27-530a-b2fc-24044798f8a7" href="images/image0005.png" media-type="image/png"></item>
id="ide5dc5e63-8586-5ff2-94d9-037b0543ac24" href="images/gophercolor16x16.png" media-type="image/png"></item>
id="idfbc147f0-b6fd-5f0f-9d96-69f1c80e2ec6" href="images/01filenametest.png" media-type="image/png"></item>
id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"></item>
Expected:
id="filenamewithspace.png" href="images/filename with space.png" media-type="image/png"></item>
id="gophercolor16x16.png" href="images/gophercolor16x16.png" media-type="image/png"></item>
id="id01filenametest.png" href="images/01filenametest.png" media-type="image/png"></item>
id="image0005.png" href="images/image0005.png" media-type="image/png"></item>
id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"></item>
id="testfromfile.png" href="images/testfromfile.png" media-type="image/png"></item>
CI not run test on this one i don't know why matbe related to #4
you need to modify the test if you modify the ID generation.
Merge latest changes from master into this branch and let's check if theres a problem with #4
now you can see that test in CI but yet i don't have idea how should i fix that.
now you can see that test in CI but yet i don't have idea how should i fix that.
Since you are modifying the way IDs are created, you need to edit the test here: https://github.com/go-shiori/go-epub/blob/main/epub_test.go#L845-L851
That raises the question. Should we store the files with the same filename as the ID? images/{uuid}.extension
?
Since you are modifying the way IDs are created, you need to edit the test here: https://github.com/go-shiori/go-epub/blob/main/epub_test.go#L845-L851
i explain what is the problem to update this section here https://github.com/go-shiori/go-epub/pull/3#issuecomment-1703789380 how should i get UUID when it is generate random UUID in each test run? this is not a fix thing that i can compare at the end.
That raises the question. Should we store the files with the same filename as the ID? images/{uuid}.extension ?
go-epub
automatically change name if it exist same filename before in images/
so we never have same filename in images/
i think it is unnecessary do you have any reason to do that?
i change that unit test:
idUUID
format in manifestnot prefect but it is better than noting.
i change that unit test:
- get any
idUUID
format in manifest- count them and check with what we expected
- be sure nav is exist in manifest
not prefect but it is better than noting.
I'm thinking that since the files are hardcoded, you can hardcode the IDs as well, since they are deterministic.
It would also be interesting to check if a duplicate <item id="x">
is allowed by the EPUB spec.
thanks i review them to impediment them
@fmartingr i back to old method just replace with new id. about duplicate id. when we check that with fix variable if any different show up duplicate similar etc this test faild (current method). do we need check duplication separately?
@fmartingr i back to old method just replace with new id. about duplicate id. when we check that with fix variable if any different show up duplicate similar etc this test faild (current method). do we need check duplication separately?
I was thinking on a separate test that adds the same image twice, to see how the generation behaves and confirm that everything works as expected. Do you think this would be useful?
@fmartingr i back to old method just replace with new id. about duplicate id. when we check that with fix variable if any different show up duplicate similar etc this test faild (current method). do we need check duplication separately?
I was thinking on a separate test that adds the same image twice, to see how the generation behaves and confirm that everything works as expected. Do you think this would be useful?
we have similar to that here
_, err = e.AddImage(testImageFromFileSource, testImageFromFileFilename)
if err != nil {
t.Error(err)
}
_, err = e.AddImage(testImageFromFileSource, "")
if err != nil {
t.Error(err)
}
in this part of unit test we add same image twice once with name and other without name (go epub
automatically create name image0005.png
) that we check exist of that in manifest.
i will add another one same file without naming that. that check generate file name twice
do you want it be a separate test or done that in this test with add images twice.
@fmartingr i add same file twice in https://github.com/go-shiori/go-epub/pull/3/commits/e57fc62a9a41f9330eb5eee9a7375e97c7b219f5 if you think it should be a separate test tell me to write a sperate test for that.
i think add one file twice is good to check go-epub
can handle that (filename and identifire) :+1:
sidenote: go-epub
not allow to save same file name inside epub it generate a new name if user not force filename
when add files to epub or ignore that as i explain (force add file with same filename to epub) #6.
Oh ok I didn't notice that, if we are already handling that scenario I think we're good. Awesome work :D
yes we cover that. if it OK for you please approve this and other PR.
Hi. it is enough or do you want to done this functionality with V5 too?
tempDir := uuid.Must(uuid.NewV4()).String()
EmbededImage add same URL how many times it exist in HTML file that make unnecessary big file and some other problem. this PR download and add image to epub just once and reference all of same url image to the same file inside epub.