IP Clearance - Githubissues

justinmclean commented 4 years ago

All files developed at the ASF need to have an ASF header [1]; third-party headers, for the most part, need to be retained [2].
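As a rough illustration of the header requirement, here is a minimal Python sketch that checks whether a file's leading comment contains the opening phrase of the standard ASF license header. The function name `has_asf_header` and the 30-line window are assumptions made for this example; they do not come from any project tooling.

```python
import re

# Opening phrase of the standard ASF source header.
ASF_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def has_asf_header(source_text, max_lines=30):
    """Return True if the first `max_lines` lines contain the ASF marker.

    `max_lines` is an arbitrary window chosen for this sketch.
    """
    head = "\n".join(source_text.splitlines()[:max_lines])
    # Drop comment decoration (*, #, /) and collapse whitespace so that a
    # header wrapped across several comment lines still matches.
    normalized = " ".join(re.sub(r"[*#/]+", " ", head).split())
    return ASF_MARKER in normalized
```

A check like this only says whether the header is present. Whether a file is allowed to carry it is the question the rest of this thread is about.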

Apache9 commented 4 years ago

Let's clear the license for the files we own first. I think it is OK for an ASF project to have some files under compatible licenses; you just need to mention them in the NOTICE file. Another possible solution is to rewrite these files so we can change the license. Anyway, this depends on the number of files whose license we cannot change.

Thanks.

patacongo commented 4 years ago

> Let's clear the license for the files we own first. I think it is OK for an ASF project to have some files under compatible licenses; you just need to mention them in the NOTICE file. Another possible solution is to rewrite these files so we can change the license. Anyway, this depends on the number of files whose license we cannot change.

I don't think anyone has committed to doing that work. Adam and Peter have, I guess, but they apparently don't have the bandwidth required to do it effectively.

I think that even the first baby steps would require a substantial, committed, full-time effort.

Apache9 commented 4 years ago

I think @xiaoxiang781216 has already found someone who wishes to help here? But anyway, we need at least one committer to review the work...

protobits commented 4 years ago

I've been writing some scripts which convert the output of git log (over a given file) into JSON format, to obtain metadata for each revision of the file. The final JSON contains (among other information): commit author, commit message and blob hash for the file. I then started writing a Python script to parse the JSON and extract (using regular expressions) the authors from the commit message and the file header in each commit. It is working nicely so far. The final goal is to determine whether a given file passes the previously discussed checks for the easy cases that can be moved to the Apache header. The Python script could also be used to make the header change and commit the result.

I will work a bit more on this and open a draft PR (to add the script inside tools/).
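The approach described above could look roughly like the following. This is a minimal Python sketch and not the actual script from the PR: the function names, the record layout and the `Author:` regular expression are assumptions made for illustration. It relies on the `git log` format placeholders `%H`, `%an`, `%ae`, `%B` and `%x1e`/`%x1f`, and on `git rev-parse <commit>:<path>` to obtain the blob hash.

```python
import json
import re
import subprocess

RECORD_SEP = "\x1e"  # emitted before each commit by %x1e
FIELD_SEP = "\x1f"   # emitted between fields by %x1f

# %H = commit hash, %an = author name, %ae = author email, %B = raw message
LOG_FORMAT = "%x1e%H%x1f%an%x1f%ae%x1f%B"

# Hypothetical pattern: "Author: Name <email>" in a commit message or a
# file header, optionally preceded by comment decoration such as " * ".
AUTHOR_RE = re.compile(r"^\W*Author:\s*(.+?)\s*<([^<>\s]+)>", re.MULTILINE)


def extract_authors(text):
    """Return 'Name <email>' strings found in a message or file header."""
    return [f"{name} <{email}>" for name, email in AUTHOR_RE.findall(text)]


def parse_git_log(text):
    """Turn delimiter-separated `git log` output into a list of dicts."""
    records = []
    for chunk in text.split(RECORD_SEP):
        if not chunk.strip():
            continue
        commit, name, email, message = chunk.split(FIELD_SEP, 3)
        records.append({
            "commit": commit.strip(),
            "author": name,
            "email": email,
            "message": message.strip(),
            "message_authors": extract_authors(message),
        })
    return records


def file_history(path):
    """Collect per-commit metadata for one file, including its blob hash."""
    log = subprocess.run(
        ["git", "log", f"--format={LOG_FORMAT}", "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    records = parse_git_log(log)
    for rec in records:
        # rev-parse fails if the file does not exist in that commit
        # (for example, the commit that deleted it); record None then.
        blob = subprocess.run(
            ["git", "rev-parse", f"{rec['commit']}:{path}"],
            capture_output=True, text=True,
        )
        rec["blob"] = blob.stdout.strip() if blob.returncode == 0 else None
    return records
```

Calling `json.dumps(file_history("some/file.c"), indent=2)` inside a Git checkout would then produce the JSON for that file; the path here is only a placeholder. Renames are not followed in this sketch.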

patacongo commented 4 years ago

> I've been writing some scripts which convert the output of git log (over a given file) into JSON format, to obtain metadata for each revision of the file. The final JSON contains (among other information): commit author, commit message and blob hash for the file.

People have been using Fossology to get historical information: https://www.fossology.org/

adamfeuer commented 4 years ago

Yeah, life intervened and I haven't been able to get back to this. I have less time for it than I thought.

@PeterBee97 made some progress in parsing out the list of contributors from the Git log messages. I will see if I can take his list and get a list of files and the number of lines of code for each contribution. Anyway, these seem to be the next steps:

1. Get a list of people who contributed.
2. Get a list of the commits they were involved with.
3. Work out how many lines of code per person are involved.
4. Sort the list from largest to smallest; this will give us an idea of how big the job is.
5. Try contacting the people with the n largest contributions.

There are several other approaches. This is just the one that seems most straightforward to me. If anyone wants to help, we could use help with:

- writing a script that could take a list of commits and output the contribution size in lines
- getting a list of names and commits from the git log (Peter's scripts are this, or very close I think)
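The first item could be sketched along these lines in Python. This is an illustration under stated assumptions, not an existing tool: the function names and the `AUTHOR ` line prefix are made up for the example, and it counts added lines only as a rough proxy for contribution size. It uses `git log --no-walk --numstat`, whose per-file lines have the form `added<TAB>deleted<TAB>path` (with `-` for binary files).

```python
import subprocess
from collections import Counter

# Hypothetical marker placed in front of each commit's author line so it
# can be told apart from numstat lines, which start with a digit or "-".
AUTHOR_PREFIX = "AUTHOR "


def lines_per_author(numstat_text):
    """Sum added lines per author, sorted from largest to smallest."""
    totals = Counter()
    author = None
    for line in numstat_text.split("\n"):
        if line.startswith(AUTHOR_PREFIX):
            author = line[len(AUTHOR_PREFIX):].strip()
            continue
        parts = line.split("\t", 2)
        if author is None or len(parts) != 3:
            continue  # blank line or anything that is not a numstat row
        added = parts[0]
        if added == "-":
            continue  # binary file: no line counts available
        totals[author] += int(added)
    return totals.most_common()


def contribution_sizes(commits):
    """Run git on an explicit list of commit hashes and tally them."""
    out = subprocess.run(
        ["git", "log", "--no-walk", "--numstat",
         f"--format={AUTHOR_PREFIX}%an <%ae>", *commits],
        check=True, capture_output=True, text=True,
    ).stdout
    return lines_per_author(out)
```

The result is a list of `(author, added_lines)` pairs, which covers the "sort largest to smallest" step directly. Counting deletions as well, or weighting by surviving lines, would be equally defensible choices.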

protobits commented 4 years ago

Please see #1834

I know @PeterBee97 started some of this work, but to be honest it was quite difficult for me to take advantage of it, considering it was based on SQLite databases. I chose JSON format since it is quite easy to read and parse with different programming languages.

patacongo commented 4 years ago

> Please see #1834
>
> I know @PeterBee97 started some of this work, but to be honest it was quite difficult for me to take advantage of it, considering it was based on SQLite databases. I chose JSON format since it is quite easy to read and parse with different programming languages.

I have to be in favor of anything that makes forward progress.

adamfeuer commented 4 years ago

@patacongo Re: anything that makes forward progress, me too.

@v01d yes, text-based JSON or CSV/TSV formats would be great. The scripts in #1834 look cool. Maybe we could combine them into one Python script with the sh module. I'll try them out.

protobits commented 4 years ago

> @v01d yes, text-based JSON or CSV/TSV formats would be great. The scripts in #1834 look cool. Maybe we could combine them into one Python script with the sh module. I'll try them out.

There's quite a bit of escaping going on in the bash script, so embedding it inside Python would probably require some work. I'm not sure if it is worth it, but we can think about it.

protobits commented 4 years ago

Comment moved to #1834

protobits commented 4 years ago

Comment moved to #1834

protobits commented 4 years ago

Oops, thought I was on the PR, I'll move the comments there

yy-gu commented 4 years ago

@justinmclean @adamfeuer

Hi guys, we made some progress and posted it here: https://github.com/apache/incubator-nuttx/issues/1954

Basically, we collected a list of the authors and companies that have not signed the agreement. So the next step is to contact them via email and get them to sign it.

My questions are the following:

1. Is there an email template for contacting the authors?
2. Where do we return the signed ICLA to? Is there somebody from the Apache Foundation to collect and verify them?

justinmclean commented 4 years ago

ICLAs are emailed to secretary@apache.org; see https://www.apache.org/licenses/contributor-agreements.html

yy-gu commented 4 years ago

@justinmclean Thanks! One more question: how would you normally contact companies to get their SGA signed? Do you contact people you know at the company to get introduced? Which department is normally responsible for this?

For the other authors, shall we just send automated emails to contact them?

yy-gu commented 4 years ago

@justinmclean One more question: shall we ask the authors to send their ICLAs directly to secretary@apache.org? Will someone from the Apache Secretary's office process the emails, update the list, and sync with us on the author list?

patacongo commented 1 year ago

I think this issue can be closed:

- It is inactive. There have been no comments since 2020.
- NuttX has since graduated to a TLP, so all IP clearance issues must have been resolved.

If there is something I am missing please just re-open.

apache / nuttx

IP Clearance #128