Closed justinmclean closed 1 year ago
Let's clear the license for the files we own first. I think it is OK for an ASF project to have some files under compatible licenses; you just need to mention them in the NOTICE file. Another possible solution is to rewrite these files so we can change the license. Anyway, this depends on the number of files whose license we cannot change.
Thanks.
I don't think anyone has committed to do that work. Adam and Peter have, I guess, but they apparently don't have the bandwidth required to do that effectively.
I think that even the first baby steps would require a substantial, committed, full-time effort.
I think @xiaoxiang781216 has already found someone who wishes to help here? But anyway, we need at least a committer to review the work...
I've been writing some scripts which convert the output of git log (over a given file) into JSON format, to obtain metadata for each revision of the file. The final JSON contains (among other information): commit author, commit message and blob hash for the file. I then started writing a Python script to parse the JSON and extract (using regular expressions) authors from the commit message and file header in each commit. It is working nicely so far. The final goal would be to determine whether a given file passes the previously discussed checks for the easy cases that can be moved to the Apache header. The Python script could also be used to make the header change and commit the result.
I will work a bit more on this and open a draft PR (to add the script inside tools/).
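For readers following along, here is a minimal sketch of the approach described above. It is not the script from the draft PR. The separator characters, the `Author:` regular expression and the function names `parse_git_log`, `extract_authors` and `file_history` are all assumptions made for illustration, and the blob hash is left out for brevity.

```python
import json
import re
import subprocess

# Assumption: ASCII unit/record separators do not occur in commit messages.
FIELD_SEP = "\x1f"
RECORD_SEP = "\x1e"
# %H = commit hash, %an = author name, %ae = author email, %B = raw message.
LOG_FORMAT = "%H%x1f%an%x1f%ae%x1f%B%x1e"

# Hypothetical pattern for lines such as " *   Author: Name <email>".
AUTHOR_RE = re.compile(r"^\s*(?:\*\s*)?Author:\s*(.+?)\s*<([^>]+)>", re.MULTILINE)


def parse_git_log(raw):
    """Turn output produced with LOG_FORMAT into a list of dicts."""
    records = []
    for chunk in raw.split(RECORD_SEP):
        chunk = chunk.strip("\n")
        if not chunk:
            continue
        commit, name, email, message = chunk.split(FIELD_SEP, 3)
        records.append({
            "commit": commit,
            "author": name,
            "email": email,
            "message": message.strip(),
        })
    return records


def extract_authors(text):
    """Return (name, email) pairs found in a commit message or file header."""
    return AUTHOR_RE.findall(text)


def file_history(path):
    """Run git log over one file and return its revisions as dicts."""
    raw = subprocess.run(
        ["git", "log", "--follow", f"--format={LOG_FORMAT}", "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_git_log(raw)


def to_json(records):
    """Serialize the revision list so other tools can consume it."""
    return json.dumps(records, indent=2)
```

Inside a checkout, `to_json(file_history("some/file.c"))` would give the per-revision metadata as JSON (the path is a placeholder), and `extract_authors` can then be run over each commit message.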
People have been using Fossology to get historical information: https://www.fossology.org/
Yeah, life intervened and I haven't been able to get back to this. I have less time for it than I thought.
@PeterBee97 made some progress in parsing out the list of contributors from the Git log messages. I will see if I can take his list and get a list of files and also the number of lines of code for each contribution... anyway, these seem to be the next steps:
There are several other approaches. This is just the one that seems most straightforward to me. If anyone wants to help, we could use help with:
Please see #1834
I know @PeterBee97 started some of this work but to be honest it was quite difficult for me to take advantage of those, considering it was based on sqlite databases. I chose JSON format since it is quite easy to read and parse with different programming languages.
I have to be in favor of anything that makes forward progress.
@patacongo Re: anything that makes forward progress, me too.
@v01d yes, text-based json or csv/tsv formats would be great. The scripts in #1834 look cool. Maybe we combine them into one python script with the sh module. I'll try them out.
There's quite a bit of escaping going on in the bash script, so embedding it inside python would probably require some work. Not sure if it is worth it, but we can think about it.
Comment moved to #1834
Oops, thought I was on the PR, I'll move the comments there
@justinmclean @adamfeuer
Hi guys, we made some progress and posted it here: https://github.com/apache/incubator-nuttx/issues/1954
Basically, we collected the list of authors/companies that have not signed the agreement. So the next step is to contact them via email and get them to sign the agreement.
My questions are the following:
ICLAs are emailed to secretary@apache.org; see https://www.apache.org/licenses/contributor-agreements.html
@justinmclean Thanks! One more question: how would you normally contact companies to get their SGA signed? Do you contact people you know from the company to get introduced? What department is normally responsible for this?
For other authors, shall we just send automated emails to contact them?
@justinmclean One more question: shall we ask authors to send their ICLA directly to secretary@apache.org? Will someone from the Apache Secretary's office process the emails, update the list and sync with us on the author list?
I think this issue can be closed:
If there is something I am missing please just re-open.
All files developed at the ASF need to have an ASF header [1]; 3rd party headers for the most part need to be retained [2].