per1234 / inolibbuglist

Check a list of Arduino libraries for common problems
MIT License

False positive #10

Closed 0x2b3bfa0 closed 5 years ago

0x2b3bfa0 commented 5 years ago

I see that one of my projects (0x2b3bfa0/2080-controller) is listed as "not a library". The issue is that it never pretended to be an Arduino library; it is a PlatformIO project.

Could I know the reason behind this classification?

per1234 commented 5 years ago

The list of libraries is automatically generated by the inoliblist project (https://github.com/per1234/inoliblist). One of the sources of the list is the arduino GitHub topic, which your repository uses. Of course, there are many projects that use such keywords but are not Arduino libraries. The script attempts to verify that the repository actually contains an Arduino library before adding it to the list. This is done by examining the contents of the repository. The script looks in your lib folder and discovers that it does indeed contain a valid Arduino library, so it adds the repository to the list. I don't see any good way to automatically determine that the library was only a dependency of your project.
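To make the false positive concrete, here is a minimal sketch of a content-based check. It is not the actual inoliblist code: the function names, the depth limit, and the rule that a folder counts as a library when it holds a `library.properties` file or a header file are all simplifying assumptions made for illustration.

```python
from pathlib import Path

# Assumption: these suffixes mark a header file for this sketch.
HEADER_SUFFIXES = {".h", ".hpp"}


def looks_like_library(folder: Path) -> bool:
    """Hypothetical heuristic: a metadata file or a header directly in the folder."""
    if (folder / "library.properties").is_file():
        return True
    return any(
        p.is_file() and p.suffix.lower() in HEADER_SUFFIXES
        for p in folder.iterdir()
    )


def find_library(repo: Path, max_depth: int = 2):
    """Return the first folder within max_depth levels of the repository root
    that looks like a library, or None if there is none."""
    level = [repo]
    for _ in range(max_depth + 1):
        next_level = []
        for folder in level:
            if looks_like_library(folder):
                return folder
            next_level.extend(sorted(p for p in folder.iterdir() if p.is_dir()))
        level = next_level
    return None
```

For a PlatformIO-style layout with `platformio.ini` and `src/main.cpp` at the top and a dependency under `lib/SomeDependency/SomeDependency.h`, this check returns the `lib/SomeDependency` folder. Nothing in the folder contents says whether that library is the project itself or only a dependency.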

The library list generation script is not perfect, so I manually add all repositories that are not Arduino libraries to a blacklist, which is excluded from the inolibbuglist scan. The plan is to eventually move parts of the repository blacklist (including the "Not a library" category) over to the inoliblist project so the blacklisted repositories never end up on the list in the first place.
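As a rough illustration of how a categorized blacklist can be excluded from a scan, consider the sketch below. The data layout (a dict mapping category names to `owner/name` strings) and the function names are assumptions; the real blacklist format in inolibbuglist may differ.

```python
def normalize(repo: str) -> str:
    """Compare repository names case-insensitively and ignore stray
    whitespace or a trailing slash."""
    return repo.strip().rstrip("/").lower()


def filter_blacklisted(candidates, blacklist):
    """Return the candidates that appear in no blacklist category.

    `blacklist` is assumed to map a category name to a list of
    "owner/name" strings (hypothetical layout).
    """
    blocked = {
        normalize(repo)
        for repos in blacklist.values()
        for repo in repos
    }
    return [repo for repo in candidates if normalize(repo) not in blocked]


# Example data: the category name comes from the discussion above,
# the second candidate is made up.
blacklist = {"Not a library": ["0x2b3bfa0/2080-controller"]}
candidates = ["0x2b3bfa0/2080-controller", "example-user/example-library"]
remaining = filter_blacklisted(candidates, blacklist)
```

Moving the same filter into the list generator would keep blacklisted repositories off the list entirely instead of only skipping them during the scan.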

"The issue is that it never pretended to be an Arduino library"

That's not an issue. As you say, your project is not an Arduino library. Your project is blacklisted as not being an Arduino library. All is right with the world. Carry on.

0x2b3bfa0 commented 5 years ago

I didn't fully understand the project's blacklisting criteria. Sorry.

To partially make up for my mistake, I would like to suggest a way of deciding whether a project is an Arduino library or not: wouldn't it be easier to analyze a set of well-known libraries to find common directory structures, and use that to classify new repositories with a thresholded confidence percentage?
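The suggestion could be sketched roughly as follows. Everything here is hypothetical: the structural features, the weighting (how often each feature occurs among known libraries), and the 0.5 threshold are illustrative choices, not something taken from inoliblist. Repositories are represented as plain lists of relative file paths so the example needs no file system access.

```python
from collections import Counter
from pathlib import PurePosixPath


def features(paths):
    """Extract a few structural markers from a list of relative file paths."""
    found = set()
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if not parts:
            continue
        first = parts[0].lower()
        if len(parts) == 1:
            if first in ("library.properties", "keywords.txt"):
                found.add(first)
            elif first.endswith(".h"):
                found.add("root header")
        elif first in ("src", "examples"):
            found.add(first + "/")
    return found


def learn_weights(known_libraries):
    """Weight each feature by the fraction of known libraries that have it."""
    counts = Counter()
    for paths in known_libraries:
        counts.update(features(paths))
    total = len(known_libraries)
    return {name: count / total for name, count in counts.items()}


def confidence(paths, weights):
    """Share of the total feature weight that this repository exhibits."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    present = features(paths)
    return sum(w for name, w in weights.items() if name in present) / total


def is_library(paths, weights, threshold=0.5):
    return confidence(paths, weights) >= threshold
```

A score like this still has to be computed for some particular folder, so a complete library sitting in a subfolder of a larger project would score as a library in its own right.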

per1234 commented 5 years ago

inoliblist already does that. The problem is that your repository does contain a valid Arduino library, so my verification code did its job perfectly. The code couldn't know that the library was not really the project, but just a dependency of the project. I don't see any way that could be detected automatically. Consider the case of a library project where the library is stored under lib, and perhaps an example program is under src. I have seen this before. Even though that is not an ideal structure for compatibility with the Arduino IDE, I would still consider it to be a valid Arduino library and want it on the list.

So the only way I can see to deal with false positives from the inoliblist script, like the one that put your repository on the list, is to manually add them to the blacklist as I discover them. The fact is that an automatically generated library list will never be perfect. I have made some earnest efforts to verify the libraries, but the script still can't compete with a human brain for identifying which repositories should be on the list and which should not. My intention was never for the list to be perfect. My goal is for the script to find as many of the Arduino libraries on GitHub as possible, even if that means that a small number of non-libraries end up on the list as well.

I actually plan to increase the search depth of the library verification code. I think it currently only goes down one or two folder levels and so it misses libraries that are more deeply nested for whatever reason. My idea is that it should search even deeper than that for a legitimate library when the parent folders are empty (or contain only files on the whitelist to be ignored, like README.md or readme.txt).
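One way that idea could look is sketched below. This is not the planned implementation: the function names, the depth cap, and the simplified library check (a `library.properties` file or a `.h` file directly in the folder) are assumptions, and only the two README names mentioned above are treated as ignorable.

```python
from pathlib import Path

# The two names come from the comment above; a real whitelist would be longer.
IGNORED_FILES = {"readme.md", "readme.txt"}


def looks_like_library(folder: Path) -> bool:
    """Hypothetical, simplified check for an Arduino library folder."""
    if (folder / "library.properties").is_file():
        return True
    return any(p.is_file() and p.suffix.lower() == ".h" for p in folder.iterdir())


def only_ignorable_files(folder: Path) -> bool:
    """True when the folder holds no files, or only whitelisted ones."""
    return all(
        p.name.lower() in IGNORED_FILES
        for p in folder.iterdir()
        if p.is_file()
    )


def find_nested_library(folder: Path, max_depth: int = 10):
    """Descend through folders that are empty or hold only ignorable files
    until a library is found. Returns the library folder or None."""
    if looks_like_library(folder):
        return folder
    if max_depth == 0 or not only_ignorable_files(folder):
        return None  # real content here that is not a library: stop descending
    for sub in sorted(p for p in folder.iterdir() if p.is_dir()):
        found = find_nested_library(sub, max_depth - 1)
        if found is not None:
            return found
    return None
```

The depth cap is only a safety limit in this sketch. The rule that actually decides whether to keep descending is that every parent folder on the way down contains nothing but subfolders and ignorable files.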

I certainly am open to any ideas on how the library verification system can be improved. My goal with the inoliblist and inolibbuglist projects was to automate everything as much as possible, so the manual blacklisting system is not at all ideal.