RhettZhou / pyEDS

pyEDS provides a systematic data processing approach to produce high-quality energy dispersive X-ray Spectroscopy (EDS) data analysis for material science study. Here we use the non-rigid registration (NRR) to reduce image distortion and non-local principal component analysis (NLPCA) to increase signal-to-noise ratio.
GNU General Public License v3.0

Consider not copying files and make a pull request #1

din14970 closed 1 year ago

din14970 commented 1 year ago

Rhett, you literally copied files from my pymatchseries project into this repo. This direct copying is not necessarily forbidden, but taking these files and publishing them as your own under a different name is definitely not the way to do things. If you take files, you should at least give credit; otherwise you violate copyright.

If a project doesn't fit your needs and you need to make changes, the proper way to do it is to fork and make a pull request to the original project. If your pull request is not accepted for whatever reason, you can continue developing the fork. At least in this case, the fact that it originates from another project is clear from the git history. Just taking out individual files, editing them, and putting them in your own project is frowned upon.

RhettZhou commented 1 year ago

Hi, Niels,

Please be aware that your work has been carefully cited in the introduction!!!

Reference ... [5] Niels Cautaerts, pymatchSeries:10.5281/zenodo.4506873

During the installation step, I also added your code via "pip install pyMatchSeries". This is from your code. I did not claim I developed the "pyMatchSeries" code.

I am not sure why you behave so impolitely. It may be better to read what is included in the introduction before you send this kind of message!

GitHub is a public platform! Please watch what you say!

din14970 commented 1 year ago

This is not what it is about, Rhett, and my message is very polite. What it is about is that your repo contains files in python/pyeds which were directly copied from my repo. You add a reference to my repo in the intro, but this does not make it clear that these files were not created by you. In fact, your putting them in a differently named folder makes it appear that this is your original work. You also don't ask users to cite me; you only ask them to cite your upcoming paper.

You may copy code from my repo, but you may not violate the terms of the GPL license, which you are clearly doing in this repo, specifically on two points:

The relevant snippets from the license:

  4. Conveying Verbatim Copies.

  You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.

  You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.

  5. Conveying Modified Source Versions.

  You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:

    a) The work must carry prominent notices stating that you modified
    it, and giving a relevant date.

    b) The work must carry prominent notices stating that it is
    released under this License and any conditions added under section
    7.  This requirement modifies the requirement in section 4 to
    "keep intact all notices".

    c) You must license the entire work, as a whole, under this
    License to anyone who comes into possession of a copy.  This
    License will therefore apply, along with any applicable section 7
    additional terms, to the whole of the work, and all its parts,
    regardless of how they are packaged.  This License gives no
    permission to license the work in any other way, but it does not
    invalidate such permission if you have separately received it.

    d) If the work has interactive user interfaces, each must display
    Appropriate Legal Notices; however, if the Program has interactive
    interfaces that do not display Appropriate Legal Notices, your
    work need not make them do so.

  A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit.  Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.

The GPL is a legally binding contract which you sign up to when you use the licensed software. You violate this contract on two levels.

If you ask for pymatchseries to be a dependency, I don't understand why you need to copy the files into your own repo. If there are errors or the code doesn't fit your needs, you can make a fork or a pull request. Note that even if you didn't copy the files and only use pymatchseries as a dependency, you still must license your repo under GPL.

The fact that it is a public platform is in fact ideal in cases like this. At least I raise my concern and claim my copyright publicly.

RhettZhou commented 1 year ago

Hello Niels,

No, I don't think the title and the content have anything to do with politeness.

First, the title is completely misguided. When I read the title, I feel it is an insult to me and destruction of my reputation. I have mentioned it before. Your work has been carefully cited. You did not mention this in your first comment. It looks like I did not cite your work to create this repo. In fact, I cited you in the original submission. This is to give you credit.

Second, I found dirty words like "sh--" in your comments. I do not think this is a gentleman's behavior.

Third, the GPL 3.0 license is under my repo. I am not sure why you could not see it. The statement "At least you should add a GPL 3 license to your repo" from you is not true. I don't want to use any of the codes commercially. The only thing I want is to make the source code available for the benefit of the community. I don't want the false statement to leave a bad impression on me in the community. Please correct the statement if you can. Indeed, the two statements in the title are both wrong.

Fourth, your code is used during the installation phase. This is the second place your merit is mentioned. The title is completely misleading and destroys my reputation!!!

Fifth: The reason I also uploaded a separate file is that 1. I need to change the I/O settings so I can easily run my pyEDS Jupyter notebook. 2. there are some small changes to correct some issues (in your pymatchSeries code). To make this clear, I have added the "matchseries.py" code, the "config_tools.py" code, and the "default_parameters.param" code with the note: "This code is written by Dr. Niels Cautaerts, pymatchSeries:10.5281/zenodo.4506873". Please refer to the recent change I made. I hope this is understandable to you and the community. I have no interest in taking anything away from you. Again, your pymatchSeries was already required during the installation.

Sixth, I must say that the original work for matchSeries was by mathematician Benjamin Berkels "B. Berkels et al, Ultramicroscopy 138, 46 (2014)." This is the core of the work. I appreciate you doing the I/O for the Python interface that can benefit the community. The same is true for my contribution. What I'm doing is creating a user-friendly Jupyter notebook framework that the community can use directly. I have also contributed to solving problems such as data from non-square scans, setup for running on Linux, and step-by-step analysis for EDX. Some of these are related to the "io.py" code and the "io_utils" code. Even more important is the subsequent denoising process. It's worth noting that we don't keep reinventing things. The most important thing is to provide something that advances our scientific understanding of the materials, and the community can use the code directly for their scientific research.

Seventh, to make it even clearer, I am changing the introduction to add a note about citations. "Please also cite the above references as they are the source on which the current pyEDS was built."

din14970 commented 1 year ago

Thanks for the changes and giving credit in the files. Sorry I didn't see the licence. This was not correct, I apologize for this judgement!

The point stands that if you wanted to change or extend my code, the better way to do it is not to copy the files but to create a fork. With a fork, my contribution is automatically clear in the git history, and your changes are explicit as well. Then there is absolutely no discussion about who made what. Copying the files into a different repository and throwing away the git history is generally frowned upon in the community, because it makes it look as if you made everything.

I agree it is best to make something useful for the community, which is exactly why duplication and fragmentation are harmful. The best software packages are those that build a community and are continuously maintained and improved.

And I also agree that matchseries is the real backbone, which is why I try to make this very explicit in my readme and tell people to cite those papers. I also don't copy the matchseries code into my repo.