Exiv2 / exiv2

Image metadata library and tools
http://www.exiv2.org/

Google Summer of Code 2019 #659

Closed clanmills closed 5 years ago

clanmills commented 5 years ago

The following email has arrived from Google:


Hi all,

If your open source organization is interested in being a mentoring organization for Google Summer of Code 2019 please be sure to submit your organization's application via g.co/gsoc before February 6 at 20:00 UTC.

Organizations chosen for GSoC 2019 will be announced on February 26th.

Best, Google Open Source Programs Office


I am interested in mentoring a student in 2019. I've discussed this by email with Dan and Luis and they enthusiastically support the idea. Due to their commitments, the work to recruit and mentor a student will fall on my shoulders and they will support on a part-time basis.

In 2012 and 2013, Exiv2 was treated as a KDE project. KDE were awarded about 60 "slots" (from 1200) which were split into about 10 "sub-projects". The Chairman of KDE/Graphics/GSoC was @cgilles. Gilles: Are you still in that role and able to help us, or should I apply directly to Google?

I will publish a project description and students will prepare proposals. I expect those proposals to be detailed and to involve prototyping work. This tests whether students are willing to invest effort to be chosen and to add this to their CV. I hope we'll recruit a European student who can visit/stay with me in England for a week at the outset of their work.

I'd like to call the project something like: "Product Strengthening". Exiv2 is a mature open-source library and used by many applications. We have three current initiatives.

1) The "dots". Exiv2 v0.27 will have quarterly updates in 2019 and 2020. (Robin and Luis)
2) "Modernisation" for Exiv2 v0.28. (Luis and Dan)
3) Dealing with users. (Robin)

All this activity leaves little time to look around and strengthen the code. Topics we have discussed are:

1) Refactor the bash scripts into the python framework
2) Support CTest
3) Extended testing (to be run on the buildserver)
  • I have a copy of the ExifTool test files on the build server
  • I have the RAW images from the Swiss web-site on the build server
  • Trawl camera manufacturers web site for more raw samples
4) Better lens recognition (discussed with @frli8848)
5) Various enhancements to our test suite. For example: more unit tests and localisation
6) Fuzzing https://github.com/google/oss-fuzz
7) Performance testing and optimisation. For example #530
8) Better web documentation concerning makernotes. For example #646
9) Static Analysis with cppcheck and PVS-Studio
10) Building for mobile (iOS and Android)

We'll think of more topics between now and May. How much can be done will depend on the ability of the student and the depth to which we dig on any topic. CTest is a well-defined topic and is either done or not. However, the other tasks are vague. For example, running exiv2 on every ExifTool test file could be a single command to run exiv2 on 6500 files and compare the output with a reference. Or we could go manually through the files and analyse ExifTool vs Exiv2 to discover metadata which we don't report. So, that could be a one-day project, or a life-time of discoveries and fixes.
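The "single command" end of that range could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the directory layout and the `<file name>.txt` naming of reference outputs are hypothetical, and `exiv2 -pa` (print all metadata) is used as the default command. The script reports every file whose current output differs from its stored reference, or which has no reference yet.

```python
import subprocess
from pathlib import Path


def run_tool(command, image):
    """Run `command` on one file and return its standard output as text."""
    result = subprocess.run(
        [*command, str(image)],
        capture_output=True,
        text=True,
        errors="replace",  # metadata may contain bytes that are not valid text
    )
    return result.stdout


def compare_with_references(image_dir, reference_dir, command=("exiv2", "-pa")):
    """Return the names of files whose output differs from the stored reference.

    Assumption (hypothetical layout): the reference for `foo.jpg` is stored
    as `foo.jpg.txt` in `reference_dir`. A missing reference counts as a
    difference.
    """
    mismatches = []
    for image in sorted(Path(image_dir).iterdir()):
        if not image.is_file():
            continue
        reference = Path(reference_dir) / (image.name + ".txt")
        actual = run_tool(command, image)
        if not reference.exists() or reference.read_text(errors="replace") != actual:
            mismatches.append(image.name)
    return mismatches
```

A call such as `compare_with_references("exiftool-tests", "references")` (both directory names are placeholders) returns the list of files that need a closer look. The manual ExifTool vs Exiv2 analysis would start from that list.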

GSoC is 12 weeks. Let's aim for 8 topics in which something useful can be achieved in about one week. We can revisit more interesting/fruitful topics towards the end of the project time.

Why am I interested in this? This will encourage us to do a "heads up" and deal with important, yet non-urgent, topics. The student will have a good experience and Exiv2 will be stronger. This is a win-win for everyone. I don't expect GSoC to impact progress towards Exiv2 v0.28. Working on this may delay a "dot" release.

I am confident that everybody involved in Exiv2 will give the student a great welcome and encouragement. Comments welcome: @fgeek @FreddieWitherden @frli8848 @piponazo @cryptomilk @Kicer86 @nkbj @tbeu @cgilles @D4N @boardhead

nkbj commented 5 years ago

Hi Robin.

I don’t know if you’re aware that there’s a great number of raw files at raw.pixls.us (http://raw.pixls.us), replacing rawsamples.ch (http://rawsamples.ch)?

Best regards, Niels Kristian Bech Jensen

clanmills commented 5 years ago

Thanks, Niels. Happy New Year. Alison and I will go to Finland in July and have been talking about Norway and Denmark en route to visit you and Freddy.

I have a copy of the files from rawsamples.ch. I was aware that pixls.us were doing something about collecting raw files. For sure we can take advantage of those resources.

I've done a lot of work this week on #646 and reached a new understanding of the inspirational Tiff Reader code. So, I think I've been tramping on ground with which you are familiar. I'm very impressed with how the binary decoders work.

nkbj commented 5 years ago

Yes. There can be some heavy bit programming involved in the decoders. :-D

I hope we can find time to meet and greet, but it might be troublesome in July. I am working the first week and then my wife and I are going to Borneo for some jungle and wildlife experiences (and photography).

Best regards, Niels Kristian Bech Jensen



D4N commented 5 years ago

As already said via email, I think that this definitely has potential.

I would suggest that we focus the potential topics at first on testing:

Depending on how well that goes, the student could also tackle some of the porting of the test suite and maybe some fuzzing with libFuzzer, although the problem with the latter is that it requires quite good knowledge of the exiv2 API to actually be useful.

D4N commented 5 years ago

I have another topic for the Todo list: mutation testing. This is not really a technique for testing your binary, but rather for testing the test suite: the mutation tester mutates the program itself and runs the test suite, looking for mutants that pass it although they shouldn't.

Frameworks for this already exist. One of them is mull, which looks quite promising to me. The student could evaluate this framework, try to integrate it into exiv2's test suite and figure out how to fit it into our CI. It probably won't fit into the ordinary CI run, but maybe a weekly run?
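To make the idea concrete, here is a toy Python sketch of mutation testing. It is not how mull works and has nothing to do with exiv2's code: the `is_adult` function, the two suites and the single kind of mutation (swapping a comparison operator) are all invented for illustration. Each mutant changes one comparison in the source, and a mutant "survives" when the test suite still passes.

```python
import ast

# Operator swaps used to create mutants; a real tool applies many more kinds.
SWAPS = {ast.GtE: ast.Gt, ast.Gt: ast.GtE, ast.LtE: ast.Lt, ast.Lt: ast.LtE}


def is_swappable(node):
    """True for comparison nodes whose first operator has a swap defined."""
    return isinstance(node, ast.Compare) and type(node.ops[0]) in SWAPS


class SwapOneComparison(ast.NodeTransformer):
    """Swap the operator of the `target`-th swappable comparison."""

    def __init__(self, target):
        self.target = target
        self.seen = 0

    def visit_Compare(self, node):
        self.generic_visit(node)
        if is_swappable(node):
            if self.seen == self.target:
                node.ops[0] = SWAPS[type(node.ops[0])]()
            self.seen += 1
        return node


def surviving_mutants(source, function_name, suite):
    """Return the indices of mutants that the test suite fails to detect."""
    count = sum(is_swappable(n) for n in ast.walk(ast.parse(source)))
    survivors = []
    for target in range(count):
        tree = SwapOneComparison(target).visit(ast.parse(source))
        ast.fix_missing_locations(tree)
        namespace = {}
        exec(compile(tree, "<mutant>", "exec"), namespace)
        if suite(namespace[function_name]):  # suite passed: mutant survived
            survivors.append(target)
    return survivors


# Hypothetical code under test and two hypothetical suites.
SOURCE = """
def is_adult(age):
    return age >= 18
"""


def weak_suite(is_adult):
    return is_adult(30) and not is_adult(5)


def strong_suite(is_adult):
    return is_adult(30) and not is_adult(5) and is_adult(18)


print(surviving_mutants(SOURCE, "is_adult", weak_suite))    # [0]
print(surviving_mutants(SOURCE, "is_adult", strong_suite))  # []
```

The weak suite never checks the boundary value, so the mutant `age > 18` passes it unnoticed. Adding the boundary case kills the mutant. A survivor therefore points at a gap in the tests rather than at a bug in the program.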

piponazo commented 5 years ago

We did not have any luck with the GSoC application 😢 ... I'll close this issue to keep the list of issues as clean as possible.