Multiple STDF files - Githubissues

sbazili commented 2 years ago

Hi, is it possible to support loading multiple STDF files? If yes, a few features could be added:

1) Merge all STDFs
2) Merge / override results with retest data for specific units (by X / Y location)
3) Kappa / correlation between 2 STDFs - parametric and bin

noonchen commented 2 years ago

To be honest, this is something I have wanted to do for a long time, ever since I first developed this app.

But this is harder than you think, and I don't think I can do it on my own, otherwise it would have already shipped in previous releases.

I'll mark this as a feature request, but you won't see this feature coming any time soon. I suggest you go for commercial software for these complex tasks.

noonchen commented 2 years ago

Hi @sbazili ,

I am now working on the multi-file feature. The basic reading and parsing is in progress and no blocking issues have been found.

As for the features you mentioned, I am not experienced with commercial software, so I might need some input.

1. What is the difference between merging STDF files and opening multiple files at the same time?
2. Is override the same as superseded in the STDF spec?
3. What are the equations or references for the kappa / correlation calculation?

sbazili commented 2 years ago

Hi @noonchen,

1.a) opening multiple files at the same time

Here we treat every file as a different source, and I guess the easy way is to present each one in a new window in the app.

1.b) merging stdf files

Here we treat all files as the same source. This includes 2 sub-cases:
[ ] merge (concat) all results, without distinguishing between retests and different units
[ ] merge (override) with the latest results from the retest; the user will need to specify which file is the 1st test, which is the 2nd (retest) and so on
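
The two sub-cases can be sketched roughly as follows. This is only an illustration: the `Results` shape (a dict keyed by (X, Y)) and the function names `merge_concat` and `merge_override` are hypothetical and are not part of STDF-Viewer.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]        # (X, Y) location of a unit
Results = Dict[Coord, dict]    # per-unit results from one file (assumed shape)


def merge_concat(files: List[Results]) -> List[Tuple[Coord, dict]]:
    """Keep every result from every file, retests included."""
    return [(coord, res) for f in files for coord, res in f.items()]


def merge_override(files: List[Results]) -> Results:
    """Later files win: a retest replaces the earlier result at the same (X, Y)."""
    merged: Results = {}
    for f in files:            # the caller supplies the files in test order
        merged.update(f)
    return merged


first = {(1, 1): {"hbin": 1}, (1, 2): {"hbin": 4}}
retest = {(1, 2): {"hbin": 1}}

print(len(merge_concat([first, retest])))        # 3 rows, the retested unit appears twice
print(merge_override([first, retest])[(1, 2)])   # {'hbin': 1}, the retest result
```

The override variant depends entirely on the file order given by the user, which is why the order has to be specified explicitly.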

2) Yes, please see 1.b.
3) Kappa is defined from the % of mismatch (kappa 100% means no mismatch), which can be presented in a graphical 2D table with colors.
- [ ] It can be a comparison of the same units with the OLD and NEW TP, or across any other parameter such as TIU, tester and so on.

Here is an example of a perfect kappa report for 100 units (kappa 100%). How to read it:
- The same 50 units get BIN1 in both the OLD and NEW TPs
- The same 50 units get BIN2 in both the OLD and NEW TPs

Here is an example with 2 units that switch BIN (kappa 98%). How to read it:
- 1 unit gets BIN4 in the OLD TP and BIN1 in the NEW TP
- 1 unit gets BIN1 in the OLD TP and BIN3 in the NEW TP
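
A minimal sketch of how the two reports above could be computed, assuming each run is available as a mapping from a unit identifier to its bin. The function name `bin_agreement` is hypothetical. Note that this is the plain percentage of matching units, as described in this thread, and not the chance-corrected Cohen's kappa statistic.

```python
from collections import Counter
from typing import Dict, Hashable, Tuple


def bin_agreement(old: Dict[Hashable, int],
                  new: Dict[Hashable, int]) -> Tuple[Counter, float]:
    """Return the (old bin, new bin) cross-table and the % of units whose bin matches."""
    if not old or old.keys() != new.keys():
        raise ValueError("both runs must contain the same, non-empty set of units")
    table = Counter((old[u], new[u]) for u in old)
    matched = sum(n for (b_old, b_new), n in table.items() if b_old == b_new)
    return table, 100.0 * matched / len(old)


# 100 units: 50 in BIN1 and 50 in BIN2, identical in both runs
old = {u: 1 for u in range(50)}
old.update({u: 2 for u in range(50, 100)})
table, pct = bin_agreement(old, dict(old))
print(pct)             # 100.0

# Same lot, but two units switch bins between the OLD and NEW TP
old2, new2 = dict(old), dict(old)
old2[0] = 4            # unit 0: BIN4 in OLD, BIN1 in NEW
new2[1] = 3            # unit 1: BIN1 in OLD, BIN3 in NEW
table2, pct2 = bin_agreement(old2, new2)
print(pct2)            # 98.0
print(table2[(4, 1)], table2[(1, 3)])   # 1 1
```

The `table` counter holds the cells of the 2D table; the off-diagonal cells are the ones that would be highlighted with colors.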

noonchen commented 2 years ago

@sbazili Thanks for such valuable information!

What I am developing now is more like a "compare" mode, in which users can see the differences in the same test item across multiple files.

Merging is different from this mode, since I would have to process multiple files and treat them as a single file. This is a complex task because I need to redesign the database and the file loading logic. It's not too hard to implement, but it may have to wait for future releases.

Kappa correlation fits within the scope of "compare" mode, but I still have some questions:

How do we identify that a unit is the same across multiple files? If we use the part ID, how do we deal with superseded DUTs? If we use (x, y), what if there are multiple wafers in a single file?
From my understanding, the total DUT count, HBIN and SBIN must be the same in the different files. Is that correct? Can commercial software calculate kappa on totally different files?

sbazili commented 2 years ago

Hi @noonchen,

Regarding your question:

How do we identify that a unit is the same across multiple files? If we use the part ID, how do we deal with superseded DUTs? If we use (x, y), what if there are multiple wafers in a single file?

Here is my answer: In order to support the part ID correctly, I would suggest using custom logic that can be defined by the user (1-time setting). For example, a drop-down list for DTR / GDR / PTR / PRR, and then a regex that defines how to parse the part ID. It can work in a "live" / preview mode until the user sees that it parses correctly on a single file.

Below are a few examples I have seen of how different companies store the part ID inside STDF. There are probably more options, so I think that for every field (SITE, X, Y, Wafer, Lot) the user should define where to read it from. That way we cover all possible options.

[ ] Single DTR record with the custom format, with SITE, X, Y, Wafer, Lot
[ ] Single PRR record with the custom format, with X_coord, Y_coord, and other fields can be in 'PART_TXT' of the same record or any other record defined here (DTR / GDR / PTR)
[ ] Single GDR record with the custom format, with SITE, X, Y, Wafer, Lot
[ ] Multiple PTR records for each field: SITE, X, Y, Wafer, Lot

I think if you can support such a feature, it will improve the analysis and open many options.
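
To make the regex idea concrete, here is a rough sketch. The text format `SITE=1;X=12;Y=-3;WAFER=07;LOT=AB123`, the pattern and the helper name `parse_part_id` are all assumptions made for illustration; a real setup would use whatever pattern the user enters.

```python
import re
from typing import Dict, Optional

# Hypothetical user-defined pattern for the text of a DTR (or GDR / PART_TXT).
# Named groups map directly onto the fields the user wants to extract.
PART_ID_PATTERN = re.compile(
    r"SITE=(?P<site>\d+);X=(?P<x>-?\d+);Y=(?P<y>-?\d+);"
    r"WAFER=(?P<wafer>\w+);LOT=(?P<lot>\w+)"
)


def parse_part_id(text: str,
                  pattern: re.Pattern = PART_ID_PATTERN) -> Optional[Dict[str, str]]:
    """Return the extracted fields, or None if the text does not match."""
    m = pattern.search(text)
    return m.groupdict() if m else None


print(parse_part_id("SITE=1;X=12;Y=-3;WAFER=07;LOT=AB123"))
print(parse_part_id("some unrelated datalog text"))   # None
```

A "live" preview could simply run `parse_part_id` on the first few records of a single file and show the result, so the user can adjust the pattern until the fields come out right.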

sbazili commented 2 years ago

Hi @noonchen, Regarding your question:

From my understanding, the total DUT count, HBIN and SBIN must be the same in the different files. Is that correct? Can commercial software calculate kappa on totally different files?

Here is my answer: Yes, when comparing 2 files in kappa mode, the number of DUTs should be the same; otherwise the files can't be compared correctly. We also need a way to identify the same PART_ID, so the feature in the answer above is mandatory for that.

noonchen commented 2 years ago

Hi @sbazili,

Thanks for the detailed description.

As for your suggested options:

Single DTR record with the custom format, with SITE, X, Y, Wafer, Lot
Single GDR record with the custom format, with SITE, X, Y, Wafer, Lot

I remember your regex suggestion in #82, but my question remains the same: according to the spec, these 2 records can appear anywhere, in any order and any count, WITHOUT any head or site info.

For me they are just two records with text in them. Even if the text contains info like site and x/y, how does it help me tell that a PRR in this file is the same DUT as one in another file? As a developer, I consider this behavior undefined.

I don't know why other companies can associate DTR/GDR with other PRRs, maybe they determined the DTR content and location so that they can read and process it.

Single PRR record with the custom format, with X_coord, Y_coord,

This one looks promising, I think I can check DUT identity by PART_ID, or using (X_COORD, Y_COORD) if wafer is detected.
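
A rough sketch of that identity check is shown below. The `Part` class is a hypothetical container for a few PRR fields plus the wafer the part was read under, and including the wafer ID in the key is an assumption meant to cover files with several wafers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Part:
    """Hypothetical subset of PRR fields, plus the wafer the part belongs to."""
    part_id: str
    x_coord: int
    y_coord: int
    wafer_id: Optional[str] = None   # None when the file has no wafer records


def dut_key(p: Part) -> Tuple:
    """Use (wafer, X, Y) when a wafer is detected, otherwise fall back to PART_ID."""
    if p.wafer_id is not None:
        return ("wafer", p.wafer_id, p.x_coord, p.y_coord)
    return ("part", p.part_id)


print(dut_key(Part("5", 3, 4, wafer_id="W01")))   # ('wafer', 'W01', 3, 4)
print(dut_key(Part("5", 0, 0)))                   # ('part', '5')
```

Two DUTs in different files would then be treated as the same unit when their keys are equal.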

Multiple PTR records for each field: SITE, X, Y, Wafer, Lot

PTR is just a parametric test record; I'm not sure how a PTR can be used to determine that a DUT is the same.

sbazili commented 2 years ago

Hi @noonchen, Please see my reply inline.

As for your suggested options:

Single DTR record with the custom format, with SITE, X, Y, Wafer, Lot
Single GDR record with the custom format, with SITE, X, Y, Wafer, Lot

I remember your regex suggestion in #82, but my question remains the same: according to the spec, these 2 records can appear anywhere, in any order and any count, WITHOUT any head or site info.

[ ] As far as I know (and this is also what I do in my internal scripts), the PRR record marks the end of a unit's tests, hence all DTRs and GDRs between two PRRs belong to that specific unit. I also added site info to make sure it matches.

I don't know why other companies can associate DTR/GDR with other PRRs, maybe they determined the DTR content and location so that they can read and process it.

[ ] Such a complex structure is just an example use case that is theoretically possible. If the tool provides the option to define where to take the X/Y/Wafer/Lot from, even this case will be covered. Anyway, companies usually try to define all records in the same type.

Single PRR record with the custom format, with X_coord, Y_coord,

This one looks promising, I think I can check DUT identity by PART_ID, or using (X_COORD, Y_COORD) if wafer is detected.

[ ] Yes, this is the simple case, and maybe it can be the default. But not all STDFs have this data; it depends on the file source, how it was run, which environment (SORT, FT) and the execution tool. This is why the user will need other options to select besides the default one. Some companies also use the barcode (2DID) instead of X/Y/Wafer/Lot, which can also be in DTR/GDR/PTR.

Multiple PTR records for each field: SITE, X, Y, Wafer, Lot

PTR is just a parametric test record; I'm not sure how a PTR can be used to determine that a DUT is the same.

[ ] Example: a few PTRs inside the STDF for every unit, and the user selects (1 time) the desired format from which to extract the fields.
X_COORD 1
Y_COORD 2
WAFER 3
LOT abcd

noonchen commented 2 years ago

Hi @sbazili ,

hence all DTR and GDR between two PRR's belong to the specific unit

Your use case cannot be applied to STDF files with the structure shown below, where PIRs and PRRs occur in groups. We simply cannot know which DUTs own the DTR(s) in between; this is completely arbitrary. DTRs and GDRs cannot serve as a DUT identifier in general usage.
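
The ambiguity can be illustrated with a small sketch. Records are reduced here to hypothetical `(record type, payload)` tuples, where the payload of a PIR/PRR is just a DUT label; this is a simplified stand-in, not how STDF-Viewer stores records.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str]   # (record type, payload), a simplified stand-in


def attribute_text_records(records: List[Record]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Assign DTR/GDR payloads to the DUT under test when exactly one DUT is open.

    Payloads seen while zero or several DUTs are open are returned as ambiguous.
    """
    open_duts: List[str] = []
    owned: Dict[str, List[str]] = {}
    ambiguous: List[str] = []
    for rec_type, payload in records:
        if rec_type == "PIR":
            open_duts.append(payload)
            owned.setdefault(payload, [])
        elif rec_type == "PRR":
            if payload in open_duts:
                open_duts.remove(payload)
        elif rec_type in ("DTR", "GDR"):
            if len(open_duts) == 1:
                owned[open_duts[0]].append(payload)
            else:
                ambiguous.append(payload)
    return owned, ambiguous


sequential = [("PIR", "A"), ("DTR", "x=1"), ("PRR", "A"),
              ("PIR", "B"), ("DTR", "x=2"), ("PRR", "B")]
grouped = [("PIR", "A"), ("PIR", "B"), ("DTR", "x=1"), ("DTR", "x=2"),
           ("PRR", "A"), ("PRR", "B")]

print(attribute_text_records(sequential))   # every DTR has a single owner
print(attribute_text_records(grouped))      # both DTRs end up in the ambiguous list
```

With one PIR/PRR pair per DUT the attribution is unambiguous; with grouped PIRs and PRRs the record order alone does not say which DUT a DTR belongs to.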

Few PTRs inside STDF for every unit, and the user selects (1 time) the desired format from where to extract the fields. X_COORD 1 Y_COORD 2 WAFER 3 LOT abcd

Are you sure about this? PTR doesn't contain those fields.

sbazili commented 2 years ago

Hi @sbazili ,

hence all DTR and GDR between two PRR's belong to the specific unit

Your use case cannot be applied to STDF files with the structure shown below, where PIRs and PRRs occur in groups. We simply cannot know which DUTs own the DTR(s) in between; this is completely arbitrary. DTRs and GDRs cannot serve as a DUT identifier in general usage.

[ ] Looks like a very complex case. Anyway, in such a case the company probably has a different way to distinguish between DUTs. This is where the few suggested formats come in: every user will choose what is best for them. In our case we use only PTR / DTR / GDR, hence all the records of the same DUT are between the 2 PRRs.

Few PTRs inside STDF for every unit, and the user selects (1 time) the desired format from where to extract the fields. X_COORD 1 Y_COORD 2 WAFER 3 LOT abcd

Are you sure about this? PTR doesn't contain those fields.

[ ] What I mean is that the user will select (1-time setting) the "PTR" names that represent the X/Y/Wafer/Lot. The names were only an example; they can be different for different products.
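
As a rough illustration of that 1-time setting, the sketch below maps identity fields to PTR test names chosen by the user. The mapping `FIELD_TO_PTR_NAME`, the helper `identity_from_ptrs` and the dict of per-unit PTR values are hypothetical, and the names are only the example ones from above.

```python
from typing import Dict

# Hypothetical 1-time user setting: which PTR test name carries which field.
FIELD_TO_PTR_NAME = {"x": "X_COORD", "y": "Y_COORD", "wafer": "WAFER", "lot": "LOT"}


def identity_from_ptrs(ptr_values: Dict[str, object],
                       mapping: Dict[str, str] = FIELD_TO_PTR_NAME) -> Dict[str, object]:
    """Pick the identity fields out of one unit's PTR values (test name -> value)."""
    missing = [name for name in mapping.values() if name not in ptr_values]
    if missing:
        raise KeyError(f"unit has no PTR named: {missing}")
    return {field: ptr_values[name] for field, name in mapping.items()}


unit = {"X_COORD": 1, "Y_COORD": 2, "WAFER": 3, "LOT": "abcd", "VDD_CORE": 1.8}
print(identity_from_ptrs(unit))   # {'x': 1, 'y': 2, 'wafer': 3, 'lot': 'abcd'}
```

For a different product the user would only change the mapping, not the extraction logic.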

noonchen commented 2 years ago

Hi @sbazili ,

Thanks for your time and explanations.

I need some time to develop the initial version of kappa correlation. We can discuss other options later if they are requested by other users.

noonchen / STDF-Viewer

Multiple STDF files #87